Statistics Standard Deviation Formula: Complete Guide
What Is Standard Deviation?
Standard deviation measures how spread out numbers are in a dataset. It's the square root of variance. That's the short version.
Most people get stuck here. They memorize formulas without understanding what they're actually measuring. Let me fix that.
Imagine you have two classrooms:
- Class A: Test scores of 70, 70, 70, 70, 70
- Class B: Test scores of 50, 60, 70, 80, 90
Both classes have the same mean (70). But every score in Class A is identical, while the scores in Class B range from 50 to 90. Standard deviation captures this difference in a single number.
Class A has a standard deviation of 0. Class B has a standard deviation of about 14.14 when its five scores are treated as the whole population (about 15.81 if they are treated as a sample). The higher the number, the more spread out your data is.
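To check the classroom numbers yourself, here is a minimal sketch using Python's built-in statistics module. The variable names are just illustrative, and it assumes the five scores are each class's entire population, which is why it calls pstdev.

```python
import statistics

class_a = [70, 70, 70, 70, 70]
class_b = [50, 60, 70, 80, 90]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean = statistics.mean(scores)
    # pstdev treats the scores as the whole population (divides by N)
    sd = statistics.pstdev(scores)
    print(f"{name}: mean = {mean:.1f}, SD = {sd:.2f}")

# Class A: mean = 70.0, SD = 0.00
# Class B: mean = 70.0, SD = 14.14
```

Same mean, very different spread, and the SD is the only number here that shows it.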
The Standard Deviation Formula
There are actually two formulas: one for populations and one for samples. Using the wrong one gives you a systematically different answer, and the gap is largest for small datasets.
Population Standard Deviation Formula
σ = √[Σ(xi - μ)² / N]
Where:
- σ = population standard deviation
- xi = each value in the dataset
- μ = population mean
- N = total number of values
Sample Standard Deviation Formula
s = √[Σ(xi - x̄)² / (n-1)]
Where:
- s = sample standard deviation
- xi = each value in the sample
- x̄ = sample mean
- n = sample size
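The difference is n-1 instead of n in the denominator. This adjustment (Bessel's correction) exists because deviations are measured from the sample mean, which always sits at least as close to the sample's values as the true population mean does. Dividing by n would therefore tend to underestimate the population variance. Dividing by n-1 compensates.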
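If it helps to see the two formulas as code, here is a rough sketch that translates each one directly. The function names population_sd and sample_sd are made up for this example. The last two lines cross-check the results against Python's statistics module (pstdev and stdev).

```python
import math
import statistics


def population_sd(values):
    """sigma = sqrt( sum((x_i - mu)^2) / N )"""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_sd(values):
    """s = sqrt( sum((x_i - x_bar)^2) / (n - 1) )"""
    n = len(values)
    if n < 2:
        # With one value, n - 1 is zero and the formula is undefined
        raise ValueError("sample standard deviation needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


scores = [50, 60, 70, 80, 90]
print(round(population_sd(scores), 2))  # 14.14
print(round(sample_sd(scores), 2))      # 15.81

# Cross-check against the standard library
assert math.isclose(population_sd(scores), statistics.pstdev(scores))
assert math.isclose(sample_sd(scores), statistics.stdev(scores))
```

The only line that differs between the two functions is the denominator, which is the whole point.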
Population vs. Sample: Which One Do You Use?
This trips up a lot of people. Here's the rule:
- Population: You have data for EVERYONE in the group you're studying. You're measuring the entire group.
- Sample: You have data for only a portion of the group. You're making inferences about the larger population.
Real-world example: If you're analyzing test scores for every student in a specific school, that's your population. If you're analyzing scores for 50 students chosen from that school to estimate how the whole school performs, that's a sample.
In research and statistics work, samples are far more common. You'll use the sample formula most often.
Step-by-Step: How to Calculate Standard Deviation
Let's work through an example. You have the following dataset: 4, 8, 6, 5, 3
Step 1: Find the Mean
Add all values and divide by the count.
(4 + 8 + 6 + 5 + 3) / 5 = 26 / 5 = 5.2
Step 2: Find the Deviations from the Mean
Subtract the mean from each value:
- 4 - 5.2 = -1.2
- 8 - 5.2 = 2.8
- 6 - 5.2 = 0.8
- 5 - 5.2 = -0.2
- 3 - 5.2 = -2.2
Step 3: Square Each Deviation
- (-1.2)² = 1.44
- (2.8)² = 7.84
- (0.8)² = 0.64
- (-0.2)² = 0.04
- (-2.2)² = 4.84
Step 4: Sum the Squared Deviations
1.44 + 7.84 + 0.64 + 0.04 + 4.84 = 14.8
Step 5: Divide by N (or n-1)
For a population: 14.8 / 5 = 2.96
For a sample: 14.8 / 4 = 3.7
Step 6: Take the Square Root
Population SD: √2.96 ≈ 1.72
Sample SD: √3.7 ≈ 1.92
That's it. Six steps. Practice this process until it becomes automatic.
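Here are the same six steps as a short Python sketch, using the dataset from the walkthrough. The variable names are illustrative. The printed values are rounded because floating-point arithmetic leaves tiny errors in the intermediate numbers.

```python
import math

data = [4, 8, 6, 5, 3]

# Step 1: find the mean
mean = sum(data) / len(data)

# Step 2: deviations from the mean
deviations = [x - mean for x in data]

# Step 3: square each deviation
squared = [d ** 2 for d in deviations]

# Step 4: sum the squared deviations
total = sum(squared)

# Step 5: divide by N (population) or n-1 (sample)
population_variance = total / len(data)
sample_variance = total / (len(data) - 1)

# Step 6: take the square root
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(f"mean = {mean:.1f}")                            # mean = 5.2
print(f"sum of squared deviations = {total:.2f}")      # 14.80
print(f"population variance = {population_variance:.2f}, SD = {population_sd:.2f}")  # 2.96, 1.72
print(f"sample variance = {sample_variance:.2f}, SD = {sample_sd:.2f}")              # 3.70, 1.92
```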
Standard Deviation vs. Variance
Variance is the average of squared deviations from the mean (with n-1 in place of N when you're working with a sample). Standard deviation is the square root of variance.
So why use standard deviation instead of variance?
Because variance gives you squared units. If your data is in dollars, variance is in dollars squared. That's not useful for interpretation. Standard deviation brings the units back to the original measurement, making it interpretable.
In the example above, the population variance was 2.96. If those values were dollar amounts, that would be 2.96 dollars squared, while the standard deviation would be 1.72 dollars. You can say "values typically deviate from the mean by about $1.72," which is a statement people can actually interpret.
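One way to see the units problem in action is to change the unit and watch what happens. This sketch assumes the example values are dollar amounts (as in the illustration above) and converts them to cents. Standard deviation scales by 100, just like the data. Variance scales by 100 squared.

```python
import statistics

dollars = [4, 8, 6, 5, 3]
cents = [x * 100 for x in dollars]  # same amounts, different unit

var_dollars = statistics.pvariance(dollars)  # in dollars squared
var_cents = statistics.pvariance(cents)      # in cents squared
sd_dollars = statistics.pstdev(dollars)      # in dollars
sd_cents = statistics.pstdev(cents)          # in cents

print(f"variance ratio: {var_cents / var_dollars:.0f}")  # variance ratio: 10000
print(f"SD ratio: {sd_cents / sd_dollars:.0f}")          # SD ratio: 100
```

The SD behaves like the measurements themselves, which is why it's the number you report.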
Quick Reference: When to Use Which Formula
| Situation | Formula | Denominator |
|---|---|---|
| Analyzing entire population | Population σ | N |
| Sample from larger population | Sample s | n-1 |
| Quality control (all products) | Population | N |
| Research study (subset of people) | Sample | n-1 |
| Known mean, all data points | Population | N |
| Estimated mean from sample | Sample | n-1 |
Common Mistakes to Avoid
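- Using the population formula for samples: Dividing by n instead of n-1 tends to underestimate the spread and gives a biased estimate of the population variance. Use n-1 for samples.
- Forgetting to square the deviations: Raw deviations from the mean always sum to zero, so they cancel out. Squaring makes every term non-negative.
- Confusing standard deviation with standard error: The standard error of the mean is SD divided by √n. It describes the uncertainty in the mean, not the spread of the data.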
- Ignoring outliers: Standard deviation is sensitive to extreme values. A single outlier can inflate it dramatically.
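Each of the mistakes above can be checked with a few lines of Python. This is a sketch under stated assumptions, not a proof: the first part treats five scores as a complete population and enumerates every possible sample of size 2 drawn with replacement, and the outlier value of 100 in the last part is hypothetical, added only for illustration.

```python
import itertools
import math
import statistics

# Mistake 1: dividing by n when the data is a sample.
# Average each estimator over all 25 ordered samples of size 2
# (drawn with replacement) and compare with the true variance.
population = [50, 60, 70, 80, 90]
true_variance = statistics.pvariance(population)
samples = list(itertools.product(population, repeat=2))
avg_div_n = statistics.mean(statistics.pvariance(s) for s in samples)
avg_div_n_minus_1 = statistics.mean(statistics.variance(s) for s in samples)
print(f"true variance:         {true_variance:.0f}")      # 200
print(f"average, divide by n:   {avg_div_n:.0f}")          # 100
print(f"average, divide by n-1: {avg_div_n_minus_1:.0f}")  # 200

# Mistake 2: unsquared deviations cancel out.
data = [4, 8, 6, 5, 3]
mean = statistics.mean(data)
print(abs(sum(x - mean for x in data)) < 1e-9)  # True (zero up to rounding)

# Mistake 3: standard error of the mean is SD / sqrt(n), not SD.
sd = statistics.stdev(data)
standard_error = sd / math.sqrt(len(data))
print(f"SD = {sd:.2f}, standard error = {standard_error:.2f}")  # 1.92, 0.86

# Mistake 4: one extreme value (hypothetical) inflates the SD.
sd_with_outlier = statistics.stdev(data + [100])
print(f"SD with outlier = {sd_with_outlier:.2f}")  # 38.74
```

In the first check, dividing by n lands on half the true variance on average, while n-1 recovers it exactly. The gap is this large because the samples are tiny. It shrinks as n grows.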
How to Calculate Standard Deviation in Excel or Google Sheets
You don't need to do this by hand every time. Spreadsheets have built-in functions.
For a Sample:
Use =STDEV.S(range)
For a Population:
Use =STDEV.P(range)
That's the simplest way. Just plug in your data range and you get the answer instantly.
Standard Deviation in the Real World
Where does this actually show up?
- Finance: Measuring investment volatility. A high standard deviation means the investment's returns swing wildly.
- Quality control: Checking if product dimensions stay consistent. Low SD means consistent quality.
- Education: Comparing how students perform. High SD means mixed abilities in the group.
- Medicine: Clinical trials measure how consistently patients respond to treatment.
- Sports: Evaluating player consistency. A basketball player's scoring SD shows whether they're reliable or unpredictable.
The 68-95-99.7 Rule (Empirical Rule)
For normally distributed data, standard deviation has predictable relationships to the mean:
- About 68% of data falls within 1 SD of the mean
- About 95% of data falls within 2 SD of the mean
- About 99.7% of data falls within 3 SD of the mean
If exam scores average at 75 with an SD of 10, about 68% of students scored between 65 and 85. About 95% scored between 55 and 95.
This only works for roughly normal distributions. Skewed data doesn't follow this rule.
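You can reproduce these percentages with NormalDist from Python's statistics module. This sketch assumes exam scores follow an exactly normal distribution with mean 75 and SD 10, which real scores only approximate.

```python
from statistics import NormalDist

# Idealized model of the exam example: mean 75, SD 10
exam = NormalDist(mu=75, sigma=10)

for k in (1, 2, 3):
    low = exam.mean - k * exam.stdev
    high = exam.mean + k * exam.stdev
    # Share of the distribution between low and high
    share = exam.cdf(high) - exam.cdf(low)
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share:.1%}")

# within 1 SD (65 to 85): 68.3%
# within 2 SD (55 to 95): 95.4%
# within 3 SD (45 to 105): 99.7%
```

The familiar 68, 95, and 99.7 are rounded versions of these figures.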
Getting Started: Your Action Plan
- Identify your data type. Population or sample? This determines your formula.
- Calculate the mean. Sum all values, divide by count.
- Subtract the mean from each value. This gives you deviations.
- Square each deviation. No negatives to deal with.
- Sum all squared deviations. Add them up.
- Divide by N or n-1. Get the variance.
- Take the square root. That's your standard deviation.
Do this manually for a few datasets until you understand what's happening. Then use spreadsheet functions for anything real.
Bottom Line
Standard deviation tells you how spread out your data is. The formula looks intimidating, but it's just six straightforward steps. Population data uses N in the denominator. Samples use n-1. Get this right and you're set for most statistical work.