Sample Standard Deviation: Calculation and Interpretation
What Standard Deviation Actually Is
Standard deviation measures how spread out numbers are from their average. That's it. Nothing fancy.
If your data points cluster tightly around the mean, your standard deviation is small. If they're scattered all over the place, it's large.
Most people overthink this. You don't need a statistics degree to understand it. You need to know one thing: standard deviation tells you how well your average represents the individual values behind it.
Why You Should Care
Here's the bitter truth: an average without context is useless.
Say your investment returns averaged 12% over five years. Sounds good. But if year 4 was +60% and year 5 was -40%, that average hides a wild ride. The standard deviation would have shown you the volatility hiding behind the "good" number.
Standard deviation appears in:
- Finance (measuring investment risk)
- Quality control (is this product consistent?)
- Science (are these results repeatable?)
- Testing (how widely did student scores vary?)
Population vs. Sample Standard Deviation
Two formulas exist. Use the wrong one, and your numbers are wrong.
Population Standard Deviation (σ)
Use this when you have every single data point in your group. No exceptions, no estimates.
Formula: σ = √(Σ(x - μ)² / N), where μ is the population mean and N is the number of values.
Sample Standard Deviation (s)
Use this when you're working with a subset of data and trying to estimate the larger population.
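Formula: s = √(Σ(x - x̄)² / (n - 1)), where x̄ is the sample mean and n is the sample size.
The difference: divide by N for a population, divide by (n - 1) for a sample. Dividing by n - 1 instead of n is called Bessel's correction. Deviations are measured from the sample mean x̄, which sits closer to the sample's own values than the true population mean does, so dividing by n would underestimate the spread. Dividing by n - 1 offsets that bias.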
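To see why the n - 1 divisor matters, here is a small simulation sketch in Python. The population (the integers 1 to 100), the sample size of 5, the number of trials, and the seed are all arbitrary choices made for illustration, not values from this article. It draws many small samples and averages the two competing variance estimates.

```python
import random

# Hypothetical population, chosen only for illustration: the integers 1..100.
population = list(range(1, 101))
mu = sum(population) / len(population)
true_variance = sum((x - mu) ** 2 for x in population) / len(population)

random.seed(0)        # fixed seed so the run is repeatable
n, trials = 5, 5000   # arbitrary sample size and repeat count

total_div_n = 0.0
total_div_n_minus_1 = 0.0
for _ in range(trials):
    # Draw with replacement so every draw is independent.
    sample = random.choices(population, k=n)
    x_bar = sum(sample) / n
    squared_devs = sum((x - x_bar) ** 2 for x in sample)
    total_div_n += squared_devs / n
    total_div_n_minus_1 += squared_devs / (n - 1)

avg_div_n = total_div_n / trials
avg_div_n_minus_1 = total_div_n_minus_1 / trials

print(f"true population variance:       {true_variance:.1f}")
print(f"average when dividing by n:     {avg_div_n:.1f}")
print(f"average when dividing by n - 1: {avg_div_n_minus_1:.1f}")
```

On average, the divide-by-n estimate should land near (n - 1)/n of the true variance (about 80% of it here), while the n - 1 version should land close to the true value.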
How to Calculate Standard Deviation: Step by Step
Let's say you tracked daily sales over a week: 12, 15, 14, 18, 17, 13, 16
Step 1: Find the Mean
Add everything up and divide by the count.
12 + 15 + 14 + 18 + 17 + 13 + 16 = 105
105 ÷ 7 = 15
Your mean is 15.
Step 2: Find Each Deviation from the Mean
Subtract the mean from each value:
- 12 - 15 = -3
- 15 - 15 = 0
- 14 - 15 = -1
- 18 - 15 = 3
- 17 - 15 = 2
- 13 - 15 = -2
- 16 - 15 = 1
Step 3: Square Each Deviation
The raw deviations always sum to zero (here, -3 + 0 - 1 + 3 + 2 - 2 + 1 = 0), so positives and negatives cancel each other out. Squaring makes every term non-negative:
- (-3)² = 9
- 0² = 0
- (-1)² = 1
- 3² = 9
- 2² = 4
- (-2)² = 4
- 1² = 1
Step 4: Sum the Squared Deviations
9 + 0 + 1 + 9 + 4 + 4 + 1 = 28
Step 5: Divide by N (or n-1)
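Since this week is your complete dataset, divide by N: 28 ÷ 7 = 4. This value is the variance.
(If the week were a sample from a longer sales history, you would divide by n - 1 instead: 28 ÷ 6 ≈ 4.67.)
Step 6: Take the Square Root
√4 = 2
Your population standard deviation is 2. Average daily sales are 15, and 5 of the 7 days fall between 13 and 17. (The sample version would be √4.67 ≈ 2.16.)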
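The six steps above can be sketched in Python using only the standard library. The variable names are illustrative. The sketch computes both the population result from the walkthrough and the sample alternative, so the effect of the divisor is visible.

```python
import math

sales = [12, 15, 14, 18, 17, 13, 16]  # daily sales from the example

# Step 1: mean
mean = sum(sales) / len(sales)

# Step 2: deviation of each value from the mean
deviations = [x - mean for x in sales]

# Step 3: square each deviation
squared = [d ** 2 for d in deviations]

# Step 4: sum of squared deviations
sum_sq = sum(squared)

# Step 5: divide by N (population) or n - 1 (sample)
population_variance = sum_sq / len(sales)
sample_variance = sum_sq / (len(sales) - 1)

# Step 6: square root
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(mean, sum_sq)          # 15.0 28.0
print(population_sd)         # 2.0
print(round(sample_sd, 2))   # 2.16
```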
Interpreting the Numbers
Now you know how to calculate it. Here's where people get lost: what does 2 actually mean?
Within one standard deviation of the mean, you typically find about 68% of your data (assuming a normal distribution).
Within two standard deviations: roughly 95% of your data.
Within three standard deviations: about 99.7%.
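Back to our sales example. With a mean of 15 and SD of 2, a normal distribution would put:
- about 68% of days between 13 and 17
- about 95% of days between 11 and 19
Seven data points can't hit those percentages exactly: here 5 of 7 days (71%) fall within one SD and all 7 fall within two.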
If a new day shows sales of 25, that's 5 standard deviations away. That's an outlier. Either something major happened, or your data entry is wrong.
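A short Python sketch of both checks: how many standard deviations a value sits from the mean (often called a z-score), and what share of the actual data falls within k standard deviations. The function names `z_score` and `share_within` are made up for this illustration.

```python
sales = [12, 15, 14, 18, 17, 13, 16]
mean, sd = 15, 2  # population mean and SD of the sales example


def z_score(value, mean, sd):
    """Number of standard deviations between value and the mean."""
    return (value - mean) / sd


def share_within(data, mean, sd, k):
    """Fraction of data points within k standard deviations of the mean."""
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)


print(z_score(25, mean, sd))                       # 5.0
print(round(share_within(sales, mean, sd, 1), 2))  # 0.71
print(share_within(sales, mean, sd, 2))            # 1.0
```

Comparing the observed shares with 68% and 95% is one quick, informal way to see how far a small dataset sits from the bell-curve ideal.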
Quick Reference: When to Use Which Formula
| Situation | Formula | Divide By |
|---|---|---|
| Every member of the group surveyed | Population (σ) | N |
| Sample from larger population | Sample (s) | n - 1 |
| Quality testing entire batch | Population (σ) | N |
| Surveying 500 people to estimate city opinions | Sample (s) | n - 1 |
Common Mistakes That Kill Accuracy
Mistake 1: Confusing population and sample SD. Pick the wrong formula, and your estimate is biased. If you're studying a sample, always use n-1.
Mistake 2: Ignoring units. Standard deviation is in the same units as your original data. If you're measuring seconds, your SD is in seconds. Don't pretend otherwise.
Mistake 3: Assuming normal distribution. The 68-95-99.7 rule only works for bell curves. Real-world data often isn't normally distributed. Check your data first.
Mistake 4: Using SD when you need the coefficient of variation. Comparing variability across datasets with different scales? Divide SD by the mean and multiply by 100. That's your CV, expressed as a percentage. It only makes sense when the mean is well away from zero.
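As a rough illustration of Mistake 4, the sketch below compares the sales example with a second, hypothetical dataset that has the same shape at 100 times the scale. The `store_b` figures and the helper name `coefficient_of_variation` are invented for this example.

```python
import statistics


def coefficient_of_variation(data):
    """Population SD as a percentage of the mean (mean must be nonzero)."""
    return statistics.pstdev(data) / statistics.mean(data) * 100


store_a = [12, 15, 14, 18, 17, 13, 16]                # sales example
store_b = [1200, 1500, 1400, 1800, 1700, 1300, 1600]  # hypothetical, 100x the scale

print(statistics.pstdev(store_a), statistics.pstdev(store_b))  # 2.0 200.0
print(round(coefficient_of_variation(store_a), 1))             # 13.3
print(round(coefficient_of_variation(store_b), 1))             # 13.3
```

The raw SDs differ by a factor of 100, yet the CV shows that both datasets vary by the same proportion of their mean.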
Using Standard Deviation in Spreadsheets
Doing this manually gets old fast. Here's how in Excel or Google Sheets:
- =STDEV.P(range) — population standard deviation
- =STDEV.S(range) — sample standard deviation
That's it. Feed it your data range, and you get the answer in seconds.
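If you work outside a spreadsheet, Python's standard-library `statistics` module offers the same pair of calculations. This is a minimal sketch using the sales data from the worked example. The mapping to the spreadsheet functions is by purpose (population vs. sample divisor), not a claim that the implementations are identical.

```python
import statistics

sales = [12, 15, 14, 18, 17, 13, 16]

# Population SD (divides by N), the same role as STDEV.P
print(statistics.pstdev(sales))            # 2.0

# Sample SD (divides by n - 1), the same role as STDEV.S
print(round(statistics.stdev(sales), 4))   # 2.1602
```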
When Standard Deviation Misleads You
Standard deviation misleads when your data has extreme outliers. Because deviations are squared, one extreme value gets outsized weight, inflates the SD, and makes the whole dataset look more variable than it is.
In those cases, use the interquartile range (IQR) instead. It's resistant to outliers and tells you where the middle 50% of your data lives.
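To make the contrast concrete, the sketch below adds one hypothetical bad entry (120) to the sales example and compares how the SD and the IQR react. The `iqr` helper is written for this illustration and relies on `statistics.quantiles` with its default quartile method. Other tools use slightly different quartile conventions, so their IQR values may differ a little.

```python
import statistics


def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    return q3 - q1


clean = [12, 15, 14, 18, 17, 13, 16]  # sales example
with_outlier = clean + [120]          # hypothetical data-entry error

for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(label, round(statistics.pstdev(data), 2), iqr(data))
```

With Python's default method, the SD jumps from 2 to roughly 35, while the IQR only moves from 4 to 4.5.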
Also, a low standard deviation doesn't mean "good." Low variation in manufacturing defects? Good. Low variation in investment returns while markets are crashing? That just means you are losing money consistently.