Standard Deviation Equation: Statistical Formula and Calculation
What Standard Deviation Actually Is
Standard deviation measures how spread out numbers are from their average. That's it. Nothing fancy. If your data points cluster tightly around the mean, you get a small standard deviation. If they're scattered all over, you get a large one.
It's often confused with variance. Variance is the average of the squared differences from the mean. Standard deviation is just the square root of variance. Why bother taking the square root? Because variance is expressed in squared units. If you're measuring dollars, variance comes out in squared dollars. Standard deviation puts you back in dollars.
You'll see this equation everywhere in statistics, finance, science, and anywhere someone needs to know if their data is consistent or all over the place.
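To make the variance versus standard deviation relationship concrete, here is a minimal sketch using Python's built-in `statistics` module. The spending figures are made up for illustration, and the variable names are just assumptions for this example.

```python
import math
import statistics

# Hypothetical daily spending in dollars (illustrative values only).
spending = [12.0, 15.0, 9.0, 20.0, 14.0]

variance = statistics.pvariance(spending)   # units: squared dollars
std_dev = statistics.pstdev(spending)       # units: dollars

# Standard deviation is the square root of variance.
assert math.isclose(std_dev, math.sqrt(variance))

print(f"variance = {variance:.2f} (squared dollars)")
print(f"std dev  = {std_dev:.2f} (dollars)")
```

The variance here is 13.2 squared dollars, which is hard to picture. The standard deviation, about 3.63 dollars, can be compared directly with the spending amounts.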
The Standard Deviation Equation
Population Standard Deviation
Use this when you have every single data point in your entire group:
σ = √[Σ(xi - μ)² / N]
Where:
- σ = population standard deviation
- xi = each individual data point
- μ = population mean (average of all data)
- N = total number of data points
- Σ = sum over all data points (here, the sum of every squared deviation)
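The population formula translates almost line for line into code. The sketch below is one way to write it. The function name `population_std` and the sample values are assumptions chosen for illustration, not something the formula prescribes.

```python
import math

def population_std(values):
    """Population standard deviation: sqrt(sum((x - mu)^2) / N)."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one data point")
    mu = sum(values) / n                                # population mean
    squared_devs = sum((x - mu) ** 2 for x in values)   # sum of squared deviations
    return math.sqrt(squared_devs / n)                  # divide by N, then take the root

# Illustrative data: mean is 5, squared deviations sum to 32, 32 / 8 = 4.
print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))   # 2.0
```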
Sample Standard Deviation
Use this when your data is just a sample of a larger population. This is what you'll use most often in real research:
s = √[Σ(xi - x̄)² / (n-1)]
Where:
- s = sample standard deviation
- xi = each data point in your sample
- x̄ = sample mean
- n = sample size
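Notice the (n-1) in the denominator. This is Bessel's correction. Deviations are measured from the sample mean, which sits closer to the sample's own values than the true population mean does, so the squared deviations come out too small on average. Dividing by n-1 instead of n corrects that bias in the variance estimate. If you use n instead, you'll underestimate the true variability.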
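You can check Bessel's correction without any randomness by listing every possible sample. The sketch below treats five values as the whole population, enumerates all 25 samples of size 2 drawn with replacement, and averages the variance estimates. The helper name `variance` and its `ddof` parameter (borrowed from a common naming convention) are assumptions for this example.

```python
from itertools import product

def variance(values, ddof):
    """Sum of squared deviations divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - ddof)

# Treat these five values as the entire population (illustrative).
population = [70, 75, 80, 85, 90]
true_variance = variance(population, ddof=0)

# Every possible sample of size 2, drawn with replacement (25 in total).
samples = list(product(population, repeat=2))

avg_with_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_with_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)

print(true_variance)        # 50.0
print(avg_with_n)           # 25.0, dividing by n underestimates
print(avg_with_n_minus_1)   # 50.0, dividing by n - 1 matches on average
```

The correction is exact for variance. After taking the square root, the sample standard deviation is still slightly biased, but far less than it would be with n in the denominator.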
Population vs Sample: How to Choose
This trips up a lot of people. Here's the simple rule:
- You're measuring every single member of a group → population standard deviation (σ)
- You're working with a sample and trying to estimate something larger → sample standard deviation (s)
Example: You survey all 50 employees at one company about their commute time. That's your entire population. Use σ.
Example: You survey 500 people in a city and want to estimate commute times for the whole metro area. That's a sample. Use s.
In practice, sample standard deviation is way more common. Most research involves sampling.
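If you want that rule in code, one option is a tiny helper that picks the formula for you. This is only a sketch: the function name `spread`, the `is_whole_population` flag, and the commute times are all hypothetical.

```python
import statistics

def spread(values, is_whole_population):
    """Pick the formula based on whether `values` covers the whole group."""
    if is_whole_population:
        return statistics.pstdev(values)   # divides by N
    return statistics.stdev(values)        # divides by n - 1

# Hypothetical commute times in minutes (made up for illustration).
commutes = [22, 35, 18, 40, 27, 31]

print(spread(commutes, is_whole_population=True))    # every employee surveyed
print(spread(commutes, is_whole_population=False))   # a sample of a larger group
```

The sample result is always a bit larger than the population result for the same numbers, because it divides by a smaller denominator.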
Step-by-Step Calculation
Let's work through an example. Test scores: 70, 75, 80, 85, 90
Step 1: Find the Mean
Add all values and divide by count:
(70 + 75 + 80 + 85 + 90) / 5 = 400 / 5 = 80
Step 2: Find Each Deviation from the Mean
Subtract the mean from each value:
- 70 - 80 = -10
- 75 - 80 = -5
- 80 - 80 = 0
- 85 - 80 = 5
- 90 - 80 = 10
Step 3: Square Each Deviation
- (-10)² = 100
- (-5)² = 25
- (0)² = 0
- (5)² = 25
- (10)² = 100
Step 4: Sum the Squared Deviations
100 + 25 + 0 + 25 + 100 = 250
Step 5: Divide by N or (n-1)
For population: 250 / 5 = 50
For sample: 250 / 4 = 62.5
Step 6: Take the Square Root
Population: √50 ≈ 7.07
Sample: √62.5 ≈ 7.91
A standard deviation of about 7-8 on a test where scores range from 70 to 90 tells you the scores are fairly clustered. If that same mean of 80 came from scores of 20, 80, 140, 80, 80, the standard deviation would be about 38 (population) or 42 (sample), and you'd know something weird was happening.
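The six steps above can be sketched in Python, one line per step. The variable names are just choices for this example. The second half runs the scattered set of scores through the standard `statistics` module for comparison.

```python
import math
import statistics

scores = [70, 75, 80, 85, 90]

mean = sum(scores) / len(scores)                 # Step 1: 80.0
deviations = [x - mean for x in scores]          # Step 2: -10, -5, 0, 5, 10
squared = [d ** 2 for d in deviations]           # Step 3: 100, 25, 0, 25, 100
total = sum(squared)                             # Step 4: 250.0

population_variance = total / len(scores)        # Step 5: 50.0
sample_variance = total / (len(scores) - 1)      # Step 5: 62.5

population_sd = math.sqrt(population_variance)   # Step 6
sample_sd = math.sqrt(sample_variance)           # Step 6

print(round(population_sd, 2), round(sample_sd, 2))   # 7.07 7.91

# Same mean of 80, but far more scattered.
scattered = [20, 80, 140, 80, 80]
print(round(statistics.pstdev(scattered), 2))   # 37.95
print(round(statistics.stdev(scattered), 2))    # 42.43
```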
Quick Comparison
| Type | When to Use | Denominator |
|---|---|---|
| Population (σ) | You have every data point in the group | N |
| Sample (s) | You're working with a subset to estimate a larger population | n - 1 |
What Standard Deviation Tells You
In a normal distribution (bell curve), roughly:
- 68% of data falls within 1 standard deviation of the mean
- 95% falls within 2 standard deviations
- 99.7% falls within 3 standard deviations
This is the empirical rule, also called the 68-95-99.7 rule. It's useful for spotting outliers. If something falls beyond 3 standard deviations from the mean, it's unusual. Worth investigating.
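Here is a short sketch that checks those percentages against a normal distribution and then applies the 3-standard-deviation rule to some data. It uses `statistics.NormalDist` from the Python standard library (available since Python 3.8). The helper names `share_within` and `flag_outliers` and the readings are hypothetical.

```python
import statistics
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, f"{share_within(k):.4f}")   # 0.6827, 0.9545, 0.9973

def flag_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > threshold * sd]

# Hypothetical readings: twenty ordinary values plus one extreme value.
readings = [48, 49, 50, 51, 52] * 4 + [90]
print(flag_outliers(readings))   # [90]
```

One caveat: with very small datasets, no value can sit 3 sample standard deviations from the mean, so this check only becomes useful once you have more than about ten points.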
Standard deviation also feeds into the coefficient of variation: CV = (SD / Mean) × 100. Because it expresses spread as a percentage of the mean, it lets you compare variability across datasets with different scales or units. A CV of 5% means the data is tightly packed relative to its average. A CV of 50% means it's all over the place.
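As a rough illustration of the coefficient of variation, the sketch below compares two made-up datasets on very different scales. Using the sample standard deviation is an assumption here, since the formula above doesn't say which SD to use. The function name and data are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample SD as a percentage of the mean (the mean must be nonzero)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100

# Hypothetical measurements on very different scales.
part_widths_mm = [9.8, 10.1, 10.0, 9.9, 10.2]
monthly_sales_usd = [12000, 18000, 9000, 25000, 16000]

print(f"{coefficient_of_variation(part_widths_mm):.1f}%")      # about 1.6%
print(f"{coefficient_of_variation(monthly_sales_usd):.1f}%")   # about 38.3%
```

The raw standard deviations (roughly 0.16 mm versus about 6,124 dollars) can't be compared directly, but the percentages can: the sales figures vary far more relative to their average.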
Common Mistakes
- Using population formula on samples. You'll underestimate variability. Use (n-1) unless you're certain you have the whole population.
- Confusing standard deviation with standard error. The standard error of the mean is SD / √n. It measures how much the sample mean would vary across repeated samples, not how spread out the individual values are. Different thing entirely.
- Ignoring outliers. One extreme value inflates standard deviation dramatically because deviations get squared. Check your data before trusting the number.
- Assuming normal distribution. The 68-95-99.7 rule only applies to normal distributions. Skewed data doesn't follow this pattern.
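Two of these mistakes are easy to see with numbers. The sketch below reuses the test scores from the worked example to show standard deviation next to standard error, then adds one extreme score (200, chosen arbitrarily for illustration) to show how much a single outlier inflates the result.

```python
import math
import statistics

scores = [70, 75, 80, 85, 90]

sd = statistics.stdev(scores)        # spread of the individual scores
se = sd / math.sqrt(len(scores))     # standard error of the mean

print(round(sd, 2), round(se, 2))    # 7.91 3.54

# One extreme value inflates the standard deviation dramatically.
with_outlier = scores + [200]
print(round(statistics.stdev(with_outlier), 2))   # 49.5
```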
How to Calculate It
You can do this by hand for small datasets, but for anything real, use a tool.
In Excel or Google Sheets
Use =STDEV.P() for population standard deviation.
Use =STDEV.S() for sample standard deviation.
Example: If your data is in cells A1 through A20, type =STDEV.S(A1:A20) in any cell.
In Python
import statistics
data = [70, 75, 80, 85, 90]
# Sample standard deviation (divides by n - 1)
print(statistics.stdev(data))   # ≈ 7.91
# Population standard deviation (divides by N)
print(statistics.pstdev(data))  # ≈ 7.07
In R
data <- c(70, 75, 80, 85, 90)
# Sample standard deviation
sd(data)
# Population standard deviation
sqrt(var(data) * (length(data)-1) / length(data))
Online Calculators
If you just need a quick answer, type "standard deviation calculator" into Google. Input your numbers and get the result in seconds. Fine for one-off calculations. Don't rely on this if you're doing repeated analyses.
When Standard Deviation Matters
Standard deviation shows up in:
- Quality control: manufacturing uses it to check whether products stay within spec
- Finance: volatility is measured as the standard deviation of returns, and a higher SD is generally treated as a riskier investment
- Research: reported alongside means to show how much individual observations vary
- Grading: some teachers use standard deviation to curve scores
- A/B testing: helps determine whether differences between groups are statistically meaningful
Anywhere someone asks "how consistent is this data?" standard deviation is the answer.
The Bottom Line
Standard deviation is the most common way to measure spread in a dataset. The formula is straightforward: find deviations from the mean, square them, average them, take the square root. Population formula uses N. Sample formula uses n-1. Pick the right one based on whether you're measuring a whole group or a sample.
If your data is normal, about two-thirds of it will sit within one standard deviation of the mean. Use this to quickly spot whether something is typical or an outlier.