Standard Deviation Explained: Calculation and Use

What Standard Deviation Actually Is

Standard deviation measures how spread out numbers are from their average (mean). That's it. Nothing fancy.

If you have a dataset where every value is close to the mean, your standard deviation will be small. If your values are scattered all over the place, the standard deviation will be large.

This is useful because knowing just the mean tells you nothing about variability. Two datasets can have identical averages but completely different spreads. Standard deviation captures that difference.
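As a quick illustration of that last point, here is a minimal sketch using Python's standard `statistics` module. The two datasets are made-up values chosen only so that their means match; they are not from any real source.

```python
from statistics import mean, pstdev

# Two illustrative datasets (hypothetical values) with the same mean
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

mean_tight, mean_wide = mean(tight), mean(wide)
sd_tight, sd_wide = pstdev(tight), pstdev(wide)  # population SD of each

print(mean_tight, mean_wide)  # the two means are identical
print(sd_tight, sd_wide)      # the second spread is far larger
```

The mean alone cannot tell these two apart; the standard deviation can.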

Population vs Sample Standard Deviation

You need to know which one you're calculating because the formula differs.

Population standard deviation applies when your data covers every member of the group you care about. You divide by N (the total count).

Sample standard deviation estimates the standard deviation of a larger population from a sample. You divide by N-1 instead of N. This adjustment (Bessel's correction) exists because deviations measured from the sample mean are, on average, smaller than deviations from the true population mean, so dividing by N would underestimate the spread.
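The effect of dividing by N-1 is easiest to see in a simulation. The sketch below assumes a population that is normal with mean 0 and standard deviation 1 (an arbitrary choice for the demo), draws many small samples, and averages the variance computed both ways. It compares variances rather than standard deviations because that is where the correction removes the bias exactly.

```python
import random
from statistics import pvariance, variance

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 5   # small samples make the bias easy to see
TRIALS = 10_000   # number of samples to draw (assumed, adjust freely)

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(TRIALS):
    # True population variance is 1.0 by construction
    sample = [random.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
    divide_by_n.append(pvariance(sample))         # divides by N
    divide_by_n_minus_1.append(variance(sample))  # divides by N-1

avg_n = sum(divide_by_n) / TRIALS
avg_n_minus_1 = sum(divide_by_n_minus_1) / TRIALS

print(round(avg_n, 2))          # lands near 0.8, below the true 1.0
print(round(avg_n_minus_1, 2))  # lands near 1.0
```

With samples of five, dividing by N recovers only about four fifths of the true variance on average, which is the gap the N-1 divisor closes.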

In practice, many tools default to the sample version (R's sd() and pandas' .std(), for example), but not all: NumPy's np.std defaults to the population version. Check the default before you trust the number.

The Formula (And Why It Looks Weird)

The population standard deviation formula is:

σ = √[Σ(xi - μ)² / N]

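Here's what each part means: σ is the population standard deviation, xi is each individual value, μ is the population mean, Σ means "add up over all values," and N is the number of values.

In words: subtract the mean from each value, square the result, add the squares up, divide by how many values you have, then take the square root. The squaring step removes negatives (a value 5 below the mean contributes the same as one 5 above the mean) and gives extra weight to values far from the mean. The square root at the end brings the units back to the original scale.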

The sample standard deviation formula is identical except that you divide by N-1 instead of N and use the sample mean x̄ in place of μ.

Step-by-Step Calculation

Let's calculate the standard deviation for this dataset: 2, 4, 4, 4, 5, 5, 7, 9

Step 1: Find the Mean

Add everything up and divide by how many numbers you have.

(2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 40 / 8 = 5

Step 2: Subtract the Mean from Each Value

2 - 5 = -3
4 - 5 = -1
4 - 5 = -1
4 - 5 = -1
5 - 5 = 0
5 - 5 = 0
7 - 5 = 2
9 - 5 = 4

Step 3: Square Each Result

9, 1, 1, 1, 0, 0, 4, 16

Step 4: Add the Squared Values

9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32

Step 5: Divide by N (or N-1 for a sample)

Assuming this is the full population: 32 / 8 = 4

Step 6: Take the Square Root

√4 = 2

The standard deviation is 2.
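The six steps above can be sketched in plain Python. This is only an illustration of the hand calculation, with variable names chosen for readability; it treats the data as a full population, as the worked example does.

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)            # Step 1: find the mean
deviations = [x - mean for x in data]   # Step 2: subtract the mean from each value
squared = [d ** 2 for d in deviations]  # Step 3: square each result
total = sum(squared)                    # Step 4: add the squared values
variance = total / len(data)            # Step 5: divide by N (population)
sd = sqrt(variance)                     # Step 6: take the square root

print(mean, total, variance, sd)  # 5.0 32.0 4.0 2.0
```

For a sample, only Step 5 changes: divide `total` by `len(data) - 1` instead of `len(data)`.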

What the Numbers Actually Mean

A standard deviation of 2 on its own tells you nothing. You need context.

In a normal distribution (bell curve), roughly 68% of data falls within one standard deviation of the mean. About 95% falls within two standard deviations. About 99.7% falls within three.

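Using our example: mean is 5, standard deviation is 2. If the data were normally distributed, roughly 68% of values would fall between 3 and 7. Our eight-value dataset is too small to be truly normal, but it lands close: 6 of the 8 values (75%) are in that range.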

This is where standard deviation becomes practical. It lets you predict where most values will land.
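To see the 68-95-99.7 rule in action, the sketch below simulates normally distributed values with the example's mean of 5 and standard deviation of 2, then counts how many land within one, two, and three standard deviations. The simulation size and seed are arbitrary choices, and `share_within` is a helper written for this illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

MEAN, SD = 5, 2  # taken from the worked example
draws = [random.gauss(MEAN, SD) for _ in range(100_000)]

def share_within(values, k):
    """Fraction of values within k standard deviations of the mean."""
    low, high = MEAN - k * SD, MEAN + k * SD
    return sum(low <= v <= high for v in values) / len(values)

for k in (1, 2, 3):
    print(k, round(share_within(draws, k), 3))  # close to 0.68, 0.95, 0.997

# The small worked dataset only loosely follows the rule
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(share_within(example, 1))  # 0.75
```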

Population vs Sample: When to Use Which

Situation | Use | Divide By
You have every data point (entire population) | Population SD | N
You're working with a sample to estimate a larger population | Sample SD | N-1
Quality control on all products made | Population SD | N
Surveying 500 people to estimate opinions of millions | Sample SD | N-1

Most business and research situations involve samples. You're measuring a subset to make inferences about a larger group.

Common Mistakes to Avoid

Using the population formula on samples. This underestimates variability. Your standard deviation will be artificially small.

Confusing standard deviation with variance. Variance is the standard deviation squared. It's the intermediate step before you take the square root. Standard deviation is in the same units as your data; variance is in squared units.

Forgetting that standard deviation measures spread around the mean. If your data isn't centered around a meaningful average, standard deviation won't tell you much.

Assuming all data follows a normal distribution. You can compute a standard deviation for any dataset, but the 68-95-99.7 rule only holds when the data is roughly bell-shaped. If your data is heavily skewed or has outliers, the rule doesn't apply.
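The last mistake is easy to demonstrate. The sketch below generates strongly right-skewed data (exponentially distributed values, chosen here purely as an illustration) and checks what share falls within one standard deviation of the mean.

```python
import random
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

# Exponential data is strongly right-skewed (illustrative choice)
skewed = [random.expovariate(1.0) for _ in range(50_000)]

m, sd = mean(skewed), pstdev(skewed)
within_one_sd = sum(m - sd <= x <= m + sd for x in skewed) / len(skewed)

print(round(within_one_sd, 2))  # noticeably above 0.68
```

For this kind of data, well over 80% of values sit within one standard deviation, so predictions based on the 68% figure would be off.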

How to Calculate in Excel or Google Sheets

You don't need to do this by hand. Excel has built-in functions: =STDEV.P() for population standard deviation and =STDEV.S() for sample standard deviation.

Select your data range, pick the right function, done. The P version divides by N. The S version divides by N-1.

How to Calculate in Python

Python's numpy library makes this trivial:

import numpy as np

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population SD: the default ddof=0 divides by N
print(np.std(data))          # 2.0

# Sample SD: ddof=1 divides by N-1
print(np.std(data, ddof=1))  # about 2.138

When Standard Deviation Is Useless

Standard deviation misleads in several common situations, notably when the data is heavily skewed or contains extreme outliers.

In these cases, use the interquartile range, median absolute deviation, or other robust measures instead.
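As a rough sketch of why the robust measures help, the code below takes the worked dataset, replaces one value with an extreme outlier, and compares all three measures. The `iqr` and `mad` helpers are written for this illustration using the standard `statistics` module, and the outlier value of 90 is made up. Quartile definitions vary between tools, so IQR values from other software may differ slightly.

```python
from statistics import median, pstdev, quantiles

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median([abs(x - m) for x in values])

clean = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = [2, 4, 4, 4, 5, 5, 7, 90]  # hypothetical: 9 replaced by 90

for name, values in (("clean", clean), ("with outlier", with_outlier)):
    print(name, round(pstdev(values), 2), iqr(values), mad(values))
```

The single outlier inflates the standard deviation more than tenfold, while the IQR and the median absolute deviation stay exactly where they were.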

Quick Reference

What You Have | Formula | In Excel
All data points | √[Σ(x-μ)² / N] | =STDEV.P()
A sample | √[Σ(x-x̄)² / (N-1)] | =STDEV.S()

Standard deviation is a tool. Like any tool, it works well in the right situation and poorly in the wrong one. Know when to use it.