Statistics Formula for Standard Deviation
What Standard Deviation Actually Is
Standard deviation measures how spread out numbers are from their average. That's it. A low standard deviation means numbers cluster close together. A high one means they're all over the place.
You use this formula when you want to know if your data points are consistent or wildly different. It's one of the most practical stats tools you have.
The Standard Deviation Formulas
There are two versions. Pick the wrong one and your result will be off, most noticeably with small data sets.
Population Standard Deviation Formula
Use this when your data includes every single member of the group you're studying.
σ = √[Σ(xi - μ)² / N]
Where:
- σ = population standard deviation
- xi = each individual value
- μ = the population mean
- N = total number of values
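As a rough sketch, the population formula translates almost symbol for symbol into Python. The function name `population_std` is made up for this illustration, and only the standard library is assumed.

```python
import math

def population_std(values):
    """Population standard deviation: sqrt(sum((x - mu)^2) / N)."""
    n = len(values)                      # N: total number of values
    if n == 0:
        raise ValueError("need at least one value")
    mu = sum(values) / n                 # mu: the population mean
    squared_deviations = [(x - mu) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)

print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```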
Sample Standard Deviation Formula
Use this when your data is just a sample from a larger group. This is what you'll use most often in real research.
s = √[Σ(xi - x̄)² / (n-1)]
Where:
- s = sample standard deviation
- xi = each individual value
- x̄ = the sample mean
- n = sample size
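The (n-1) is called Bessel's correction. A sample's values sit closer to their own mean than to the true population mean, so dividing by n would understate the spread. Dividing by n-1 corrects that bias in the variance estimate.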
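One way to see why the correction matters, sketched with a tiny made-up population of 1, 2, 3: enumerate every possible sample of size 2 (drawn with replacement) and average the two variance estimates. The helper name `variance_with_divisor` is illustrative only.

```python
from itertools import product

population = [1, 2, 3]  # made-up population for illustration

def variance_with_divisor(values, offset):
    """Sum of squared deviations divided by (len(values) - offset)."""
    mean = sum(values) / len(values)
    ss = sum((x - mean) ** 2 for x in values)
    return ss / (len(values) - offset)

true_variance = variance_with_divisor(population, 0)   # divide by N

# Every ordered sample of size 2, drawn with replacement (9 in total).
samples = list(product(population, repeat=2))

avg_divide_by_n = sum(variance_with_divisor(s, 0) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(variance_with_divisor(s, 1) for s in samples) / len(samples)

print(true_variance, avg_divide_by_n, avg_divide_by_n_minus_1)
```

Dividing by n averages out to half the true variance here, while dividing by n-1 matches it. The match is exact for the variance; taking the square root afterwards still leaves a small bias in the standard deviation itself.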
Population vs Sample: When to Use Which
Most of the time, you're working with samples. Here's a quick way to decide:
- You measured every single person in a study? Population formula.
- You surveyed 500 people out of a million? Sample formula.
- You're analyzing stock prices for a specific day? Population formula.
- You're predicting future stock behavior based on past data? Sample formula.
If you're unsure, go with the sample formula. Using population standard deviation on sample data underestimates the true variability in the population.
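For the same data, the two formulas differ only in the divisor, so the sample result is always the population result multiplied by √(n/(n-1)). Here is a short sketch of how fast that gap closes as n grows (the function name is hypothetical):

```python
import math

def sample_to_population_ratio(n):
    """Ratio s / sigma when both formulas are applied to the same n values."""
    if n < 2:
        raise ValueError("sample standard deviation needs n >= 2")
    return math.sqrt(n / (n - 1))

for n in (2, 5, 10, 100, 1000):
    print(n, round(sample_to_population_ratio(n), 4))
```

With a handful of values the choice changes the answer a lot (about 41% at n = 2). With hundreds of values it barely matters.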
How to Calculate Standard Deviation: Step by Step
Let's use this data set: 2, 4, 4, 4, 5, 5, 7, 9
Step 1: Find the Mean
Add all values and divide by the count.
(2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 40 / 8 = 5
Step 2: Subtract the Mean from Each Value
2 - 5 = -3
4 - 5 = -1
4 - 5 = -1
4 - 5 = -1
5 - 5 = 0
5 - 5 = 0
7 - 5 = 2
9 - 5 = 4
Step 3: Square Each Result
9, 1, 1, 1, 0, 0, 4, 16
Step 4: Add All Squared Values
9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32
Step 5: Divide by n-1 (Sample) or n (Population)
For a sample: 32 / (8-1) = 32 / 7 ≈ 4.57
For a population: 32 / 8 = 4
Step 6: Take the Square Root
Sample standard deviation: √4.57 ≈ 2.14
Population standard deviation: √4 = 2
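The six steps above can be traced in Python, using the same data set so each intermediate value can be checked against the hand calculation. This is a plain standard-library sketch, and the variable names are only illustrative.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Step 1: mean
mean = sum(data) / n                          # 5.0

# Step 2: deviation of each value from the mean
deviations = [x - mean for x in data]

# Step 3: square each deviation
squared = [d ** 2 for d in deviations]

# Step 4: sum of squared deviations
sum_of_squares = sum(squared)                 # 32.0

# Step 5: divide by n - 1 (sample) or n (population)
sample_variance = sum_of_squares / (n - 1)    # 32 / 7
population_variance = sum_of_squares / n      # 4.0

# Step 6: square root
sample_std = math.sqrt(sample_variance)
population_std = math.sqrt(population_variance)

print(round(sample_std, 2), population_std)   # 2.14 2.0
```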
Variance vs Standard Deviation
Variance is just the standard deviation squared. You skip the final square root step.
Variance is useful in advanced statistics, like ANOVA or regression. Standard deviation is more intuitive because it's in the same units as your original data.
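Here is a small sketch of the units point, assuming the example values are lengths in meters (an assumption made purely for illustration). Converting to centimeters scales the standard deviation by 100 but the variance by 100², which is why variance is harder to read against the raw data.

```python
import statistics

lengths_m = [2, 4, 4, 4, 5, 5, 7, 9]           # assumed to be meters
lengths_cm = [x * 100 for x in lengths_m]      # same lengths in centimeters

std_m = statistics.pstdev(lengths_m)
std_cm = statistics.pstdev(lengths_cm)
var_m = statistics.pvariance(lengths_m)
var_cm = statistics.pvariance(lengths_cm)

print(std_cm / std_m)   # scales like the data: a factor of 100
print(var_cm / var_m)   # scales like the data squared: a factor of 10,000
```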
Tools for Calculating Standard Deviation
You don't need to do this by hand. Here's what actually works:
| Tool | Best For | Cost |
|---|---|---|
| Excel/Google Sheets (STDEV.S / STDEV.P) | Quick calculations, small to medium datasets | Free to $70 |
| Python (NumPy) | Large datasets, automation, reproducible analysis | Free |
| R | Statistical research, academic work | Free |
| Online calculators | One-off calculations, no software install | Free |
| TI-84 calculator | Classroom exams, standardized testing | $80-150 |
Excel example: =STDEV.S(A1:A100) for sample, =STDEV.P(A1:A100) for population.
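For Python users, the standard library's `statistics` module has the same split as the spreadsheet functions: `stdev` for samples and `pstdev` for populations. One thing to watch if you use NumPy instead: `numpy.std` divides by n unless you pass `ddof=1`, so its default is the population formula. The sketch below sticks to the standard library.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(data)        # divides by n - 1, like STDEV.S
population_sd = statistics.pstdev(data)   # divides by n, like STDEV.P

print(round(sample_sd, 2), round(population_sd, 2))
```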
Common Mistakes That Ruin Your Results
Mistake 1: Using population formula on sample data. This is the most common error. Your standard deviation will be too low.
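Mistake 2: Mixing up standard deviation with standard error. Standard deviation describes the spread of individual values. Standard error describes how much a sample mean is expected to vary from one sample to the next, and it equals the standard deviation divided by √n. Different thing entirely.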
Mistake 3: Forgetting to square the deviations. If you don't square before summing, negative and positive deviations cancel out. You get zero every time.
Mistake 4: Relying on standard deviation for skewed data. Because deviations are squared, outliers and long tails inflate standard deviation badly. For skewed distributions, consider median absolute deviation instead.
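Mistakes 2 through 4 can each be checked in a few lines. The sketch below uses the example data from the walkthrough plus one made-up outlier. The function names are illustrative, and the median absolute deviation here is the plain unscaled version (the median of absolute distances from the median).

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = [2, 4, 4, 4, 5, 5, 7, 90]   # made-up outlier replaces the 9

# Mistake 3: unsquared deviations from the mean cancel out.
mean = sum(data) / len(data)
raw_deviation_sum = sum(x - mean for x in data)
print(raw_deviation_sum)                    # 0.0

# Mistake 2: standard error of the mean is s / sqrt(n), not s itself.
def standard_error(values):
    return statistics.stdev(values) / math.sqrt(len(values))

# Mistake 4: median absolute deviation barely reacts to a single outlier.
def median_abs_deviation(values):
    center = statistics.median(values)
    return statistics.median([abs(x - center) for x in values])

print(statistics.stdev(data), standard_error(data))
print(statistics.stdev(with_outlier), median_abs_deviation(with_outlier))
```

On this data the outlier multiplies the sample standard deviation by more than ten, while the median absolute deviation stays at 0.5.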
What a "Good" Standard Deviation Looks Like
There's no universal answer. Context matters completely.
A standard deviation of 15 in IQ scores is normal, because the scale is designed that way. The same number in the measured dimensions of machine parts held to tight tolerances means your manufacturing process is failing.
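The coefficient of variation (standard deviation divided by mean) helps compare variability across different types of data, as long as the values are positive and measured on a scale with a true zero. As a rough rule of thumb, a CV under 1 is often treated as low variability, but the threshold depends on the field.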
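Here is a minimal sketch of the coefficient of variation, using two made-up data sets that have the same standard deviation but different means. The function name and the positive-mean check are assumptions of this example, not a standard API.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    mean = statistics.mean(values)
    if mean <= 0:
        raise ValueError("CV is only meaningful for data with a positive mean")
    return statistics.stdev(values) / mean

heights_cm = [150, 160, 170, 180, 190]   # made-up values
weights_kg = [50, 60, 70, 80, 90]        # made-up values

print(round(coefficient_of_variation(heights_cm), 3))
print(round(coefficient_of_variation(weights_kg), 3))
```

Both lists have the same absolute spread, but relative to their means the weights vary more than twice as much as the heights.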
When Standard Deviation Misleads You
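Standard deviation is easiest to interpret when your data roughly follows a normal distribution (bell curve), where about 68% of values fall within one standard deviation of the mean. You can compute it for any data set, but if your data is skewed, bimodal, or has heavy tails, it stops being a reliable summary of spread.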
Always plot your data first. Histogram or box plot. If the shape looks nothing like a bell curve, reconsider what metric you're using.
Salary data is a classic example. Most people cluster at lower salaries, with a long tail of high earners. A few very large salaries inflate the standard deviation, so it suggests a symmetric spread around the mean that does not describe what most people actually earn.
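A sketch with invented salary figures (in thousands, chosen only to mimic a long right tail) shows the problem. One high earner drags the mean far above the median, and "mean minus one standard deviation" lands below zero, which no salary can be.

```python
import statistics

# Hypothetical salaries in thousands; not real data.
salaries = [30, 32, 35, 38, 40, 42, 45, 50, 60, 400]

mean = statistics.mean(salaries)
median = statistics.median(salaries)
sd = statistics.stdev(salaries)

print(f"mean={mean:.1f} median={median:.1f} sd={sd:.1f}")
print(f"mean - sd = {mean - sd:.1f}")
```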
Quick Reference Cheat Sheet
- Mean = sum of values / count of values
- Deviation = each value minus the mean
- Squared deviation = deviation²
- Sum of squared deviations = add all squared deviations
- Variance = sum of squared deviations / (n or n-1)
- Standard deviation = √variance
That's the entire process. Memorize the steps, not the formula. Once you understand what each step does, the formula makes sense.