Standard Deviation for a Sample: Calculation Guide
What Is Sample Standard Deviation?
Standard deviation measures how spread out numbers are from their average. Sample standard deviation is what you use when your data is just a slice of a larger group — not the entire population.
The difference matters. Using the population formula on a sample systematically underestimates the spread of the larger group. Most real-world situations involve samples: surveys, experiments, quality tests. You're almost never measuring everything.
Sample vs Population Standard Deviation
The formulas look almost identical, but the denominator is different:
- Population standard deviation: divide by N (the total count)
- Sample standard deviation: divide by n-1 (Bessel's correction)
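You divide by n-1 because the deviations are measured from the sample mean, which sits closer to the sample's own values than the true population mean does. Dividing by n would therefore underestimate the true spread on average. The correction makes the sample variance an unbiased estimate of the population variance.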
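To see the underestimation in practice, here is a small simulation sketch. The population (10,000 normally distributed values), the seed, and the trial count are all assumptions chosen for illustration, not part of any real dataset. It draws many samples of size 5 and averages the variance computed both ways.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values from a normal distribution.
population = [random.gauss(100, 15) for _ in range(10_000)]
true_variance = statistics.pvariance(population)

n = 5            # sample size
trials = 20_000  # number of samples to draw
total_div_n = 0.0
total_div_n_minus_1 = 0.0

for _ in range(trials):
    sample = random.sample(population, n)
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)
    total_div_n += sum_sq / n                # population formula on a sample
    total_div_n_minus_1 += sum_sq / (n - 1)  # Bessel's correction

avg_div_n = total_div_n / trials
avg_div_n_minus_1 = total_div_n_minus_1 / trials

print(f"true population variance: {true_variance:.1f}")
print(f"average when dividing by n:   {avg_div_n:.1f}")
print(f"average when dividing by n-1: {avg_div_n_minus_1:.1f}")
```

With n=5, the divide-by-n average should land near 80% of the true variance, while the n-1 average should land close to the true value.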
When to Use Each
- Use sample standard deviation when: you're working with a subset, taking measurements from a larger group, or analyzing survey data
- Use population standard deviation when: you have every single data point in the entire group
The Formula
Here it is in all its glory:
s = √[Σ(xi - x̄)² / (n-1)]
Where:
- s = sample standard deviation
- xi = each individual value
- x̄ = the sample mean
- n = number of data points
Looks intimidating. It's not. Let's walk through it.
Step-by-Step Calculation
Let's use this dataset: 4, 8, 6, 5, 3
Step 1: Find the Mean
Add everything up and divide by how many numbers you have.
(4 + 8 + 6 + 5 + 3) / 5 = 26 / 5 = 5.2
Step 2: Find the Deviations
Subtract the mean from each value:
- 4 - 5.2 = -1.2
- 8 - 5.2 = 2.8
- 6 - 5.2 = 0.8
- 5 - 5.2 = -0.2
- 3 - 5.2 = -2.2
Step 3: Square the Deviations
- (-1.2)² = 1.44
- (2.8)² = 7.84
- (0.8)² = 0.64
- (-0.2)² = 0.04
- (-2.2)² = 4.84
Step 4: Sum the Squared Deviations
1.44 + 7.84 + 0.64 + 0.04 + 4.84 = 14.8
Step 5: Divide by n-1
We have 5 numbers, so: 14.8 / (5-1) = 14.8 / 4 = 3.7
Step 6: Take the Square Root
√3.7 ≈ 1.92
That's your sample standard deviation. About 1.92.
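The six steps can be sketched in plain Python as a way to check the hand calculation. The variable names here are illustrative choices, and only the standard library is used.

```python
import math

data = [4, 8, 6, 5, 3]
n = len(data)

mean = sum(data) / n                     # Step 1: find the mean
deviations = [x - mean for x in data]    # Step 2: find the deviations
squared = [d ** 2 for d in deviations]   # Step 3: square the deviations
sum_sq = sum(squared)                    # Step 4: sum the squared deviations
variance = sum_sq / (n - 1)              # Step 5: divide by n-1
s = math.sqrt(variance)                  # Step 6: take the square root

# Round only when displaying, not during the calculation.
print(round(mean, 2), round(sum_sq, 2), round(variance, 2), round(s, 2))
# prints: 5.2 14.8 3.7 1.92
```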
Quick Comparison Table
| Metric | Formula | When to Use |
|---|---|---|
| Population SD | √[Σ(xi-μ)² / N] | Full dataset available |
| Sample SD | √[Σ(xi-x̄)² / (n-1)] | Subset of larger group |
| Population Variance | Σ(xi-μ)² / N | Full dataset available (the square of population SD) |
| Sample Variance | Σ(xi-x̄)² / (n-1) | Subset of larger group (the square of sample SD) |
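For reference, Python's standard-library `statistics` module has one function for each row of the table. A minimal sketch using the same five-number dataset as the walkthrough:

```python
import statistics

data = [4, 8, 6, 5, 3]

pop_sd = statistics.pstdev(data)         # population SD: divides by N
sample_sd = statistics.stdev(data)       # sample SD: divides by n-1
pop_var = statistics.pvariance(data)     # population variance
sample_var = statistics.variance(data)   # sample variance

print(f"population SD:       {pop_sd:.4f}")
print(f"sample SD:           {sample_sd:.4f}")
print(f"population variance: {pop_var:.4f}")
print(f"sample variance:     {sample_var:.4f}")
```

Each variance is the square of the matching standard deviation, and the sample versions come out larger than the population versions on the same data.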
Common Mistakes
Dividing by N instead of n-1 — This is the most frequent error. If you're analyzing sample data and you divide by the full count, your result is biased downward.
Forgetting to square the deviations — The negatives cancel out if you skip this. Your answer will be zero, which is meaningless.
Using the wrong mean — Make sure you're calculating the mean from your actual data, not from some external source.
Rounding too early — Keep full precision through the calculation. Round only at the end.
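A short sketch of how three of these mistakes show up numerically, using the walkthrough dataset. Rounding the mean to a whole number is an exaggerated example of early rounding, picked to make the effect visible.

```python
import math

data = [4, 8, 6, 5, 3]
n = len(data)
mean = sum(data) / n  # 5.2

# Correct sample standard deviation for comparison.
s_exact = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Mistake: dividing by n instead of n-1 (result is biased downward).
s_div_n = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Mistake: skipping the squaring step (deviations cancel to ~0).
raw_sum = sum(x - mean for x in data)

# Mistake: rounding the mean early (5.2 becomes 5) before the deviations.
rounded_mean = round(mean)
s_rounded = math.sqrt(sum((x - rounded_mean) ** 2 for x in data) / (n - 1))

print(f"correct:           {s_exact:.4f}")
print(f"divided by n:      {s_div_n:.4f}")
print(f"sum of deviations: {raw_sum:.10f}")
print(f"rounded mean:      {s_rounded:.4f}")
```

The sum of raw deviations may print as a tiny nonzero value because of floating-point arithmetic; mathematically it is exactly zero.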
How to Calculate in Excel or Google Sheets
You don't have to do this by hand every time. Excel has built-in functions:
- =STDEV.S(range) — sample standard deviation
- =STDEV.P(range) — population standard deviation
Google Sheets uses the same syntax. STDEV.S calculates using n-1. STDEV.P uses N.
Mix them up and your numbers will be wrong. Check which one you're using.
How to Calculate in Python
NumPy handles this cleanly:
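```python
import numpy as np

data = [4, 8, 6, 5, 3]

# ddof=1 means sample standard deviation (divide by n-1)
sample_std = np.std(data, ddof=1)

# Format to two decimals for display; the unrounded value is about 1.9235
print(f"{sample_std:.2f}")  # Output: 1.92
```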
The ddof=1 parameter is the key. It tells NumPy to divide by n-1 instead of n. Default ddof is 0, which gives population standard deviation.
What Does the Number Actually Mean?
A standard deviation of 1.92 in our example means data points typically sit roughly 1.92 units from the mean. Strictly, it's a root-mean-square distance, so large deviations count for more than small ones.
Low standard deviation = data clusters tightly around the mean.
High standard deviation = data is spread out widely.
In finance, a stock with high standard deviation is volatile. In manufacturing, high standard deviation means inconsistent quality. Context determines whether a high or low value is good.
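To make "tight" versus "spread out" concrete, here is a sketch with two made-up datasets that share the same mean. The numbers are invented for illustration only.

```python
import statistics

# Two hypothetical datasets with the same mean (50) but different spread.
tight = [49, 50, 51, 50, 50]
wide = [30, 70, 50, 20, 80]

tight_mean = statistics.mean(tight)
wide_mean = statistics.mean(wide)
tight_sd = statistics.stdev(tight)  # sample SD, divides by n-1
wide_sd = statistics.stdev(wide)

print(f"tight: mean={tight_mean}, sample SD={tight_sd:.2f}")
print(f"wide:  mean={wide_mean}, sample SD={wide_sd:.2f}")
```

The means are identical, so the mean alone can't tell these datasets apart. The standard deviation can.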
When Sample Size Matters
The n-1 correction matters most when n is small. As your sample grows, the difference between dividing by n and n-1 becomes negligible.
For n=5, you divide by 4 instead of 5. That makes the variance 25% larger and the standard deviation about 12% larger. For n=1000, you divide by 999 instead of 1000. The standard deviation changes by about 0.05%, which is essentially nothing.
If your sample is large enough, the correction matters less. But you should still use it correctly.
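The size of the correction depends only on n: the sample SD equals the divide-by-n result multiplied by √(n / (n-1)). A quick sketch of that factor for a few sample sizes (`correction_factor` is a helper name made up for this example):

```python
import math

def correction_factor(n: int) -> float:
    """Ratio of the n-1 standard deviation to the divide-by-n one."""
    return math.sqrt(n / (n - 1))

for n in (5, 30, 1000):
    factor = correction_factor(n)
    print(f"n={n}: sample SD is {100 * (factor - 1):.2f}% larger")
```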
Bottom Line
Sample standard deviation is the right choice almost always. The formula is straightforward: find the mean, calculate deviations, square them, sum them, divide by n-1, take the square root.
Use STDEV.S in spreadsheets. Use ddof=1 in NumPy. Don't mix these up.