Sampling Distribution: Clear Definition and Examples
What Is a Sampling Distribution?
A sampling distribution is the probability distribution of a statistic computed over all possible random samples of a given size drawn from the same population. It's not about individual data points. It's about what happens when you take many samples and calculate the same statistic each time.
Most students confuse this with a sample distribution. Here's the difference:
- Sample distribution: the spread of values in one single sample
- Sampling distribution: the distribution of statistics (like means) across many samples
In practice you usually have only one sample, so you never observe a sampling distribution directly. You derive it theoretically, or approximate it by simulation, to understand how your estimates behave.
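To make the sample-versus-sampling distinction concrete, here is a minimal simulation sketch. The exponential population, the sample size of 40, and the 2,000 repetitions are all assumptions chosen for illustration, not values from this article.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical population (an assumption for illustration): 100,000 skewed values.
population = [random.expovariate(1 / 10) for _ in range(100_000)]

n = 40  # assumed sample size

# Sample distribution: the values inside ONE sample.
one_sample = random.sample(population, n)
spread_within_sample = statistics.stdev(one_sample)

# Sampling distribution (approximated): the means of MANY samples.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(2_000)
]
spread_of_means = statistics.stdev(sample_means)

print(f"SD of values in one sample: {spread_within_sample:.2f}")
print(f"SD of 2,000 sample means:   {spread_of_means:.2f}")
```

The second number should come out much smaller than the first: individual values vary a lot, while means of 40 values vary far less.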
Why This Matters
If you collect one sample and calculate a mean of 47, you have no idea how far that is from the true population mean. The sampling distribution tells you how much your statistic would vary if you repeated the sampling process infinitely.
This variation is predictable. That's the whole point.
The Core Components
Standard Error
The standard deviation of a sampling distribution is called the standard error (SE). For the sample mean:
SE = σ / √n
Where σ is the population standard deviation and n is the sample size. This formula tells you that variability decreases as your sample size increases. Larger samples give more precise estimates.
If you don't know σ, substitute the sample standard deviation s to get an estimated standard error:
SE = s / √n
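Both versions of the formula can be sketched in a few lines. The helper name `standard_error` and the nine data values below are hypothetical, made up purely to show the calculation.

```python
import math
import statistics

def standard_error(sd, n):
    """Standard error of the mean: a standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# Known population sigma (sigma = 10, n = 25):
se_known = standard_error(10, 25)

# Unknown sigma: estimate it with the sample standard deviation s.
sample = [52, 47, 49, 55, 43, 50, 48, 51, 46]  # hypothetical measurements
s = statistics.stdev(sample)                    # uses the n - 1 denominator
se_estimated = standard_error(s, len(sample))

print(f"SE with known sigma: {se_known:.2f}")
print(f"s = {s:.3f}, estimated SE = {se_estimated:.3f}")
```

Note that `statistics.stdev` computes the sample standard deviation (n - 1 in the denominator), which is the right choice when s stands in for σ.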
Shape of the Distribution
The shape depends on which statistic you're working with and on your sample size. For the sample mean:
- If the population is normal, the sampling distribution of the mean is exactly normal at any sample size
- If the population is not normal, the Central Limit Theorem says the sampling distribution of the mean approaches normal as n grows. n ≥ 30 is a common rule of thumb, but heavily skewed populations can need more
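One way to watch the Central Limit Theorem at work is to track how the skewness of simulated sample means shrinks as n grows. This is a rough sketch: the exponential population, the sample sizes, and the helper names are assumptions for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

def skewness(values):
    """Mean cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def sample_mean_skewness(n, reps=5_000):
    """Skewness of `reps` simulated sample means, each from n exponential draws."""
    means = [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(reps)
    ]
    return skewness(means)

# The exponential population itself is strongly right-skewed.
skews = {n: sample_mean_skewness(n) for n in (2, 10, 30, 100)}
for n, sk in skews.items():
    print(f"n = {n:>3}: skewness of sample means = {sk:.2f}")
```

The skewness should fall toward 0 as n increases, though it is still visibly positive at n = 30 for a population this skewed.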
Real Examples
Example 1: Rolling Dice
Imagine a fair six-sided die. The population mean is 3.5.
Take samples of size 10. Roll 10 dice, record the mean. Repeat this 1000 times. The distribution of those 1000 sample means is your sampling distribution.
The mean of this sampling distribution will be approximately 3.5. The population standard deviation of a fair die is σ = √(35/12) ≈ 1.71, so the standard error will be σ/√10 ≈ 1.71/3.16 ≈ 0.54.
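The dice experiment is easy to simulate. The sketch below follows the setup above (10 dice per sample, 1000 repetitions); the seed and variable names are arbitrary choices.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is reproducible

faces = range(1, 7)
sigma = statistics.pstdev(faces)         # population SD of a fair die, about 1.71
theoretical_se = sigma / math.sqrt(10)   # about 0.54

# Roll 10 dice, record the mean, repeat 1000 times.
sample_means = [
    statistics.mean(random.choices(faces, k=10)) for _ in range(1000)
]

print(f"Mean of sample means: {statistics.mean(sample_means):.3f}")
print(f"SD of sample means:   {statistics.stdev(sample_means):.3f}")
print(f"Theoretical SE:       {theoretical_se:.3f}")
```

The simulated values will not match the theoretical ones exactly, since 1000 repetitions is still a finite approximation of the sampling distribution.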
Example 2: Estimating Average Height
Suppose adult female height has μ = 65 inches and σ = 3.5 inches.
If you sample n = 50 women:
- Sampling distribution mean = 65 inches (same as population)
- SE = 3.5 / √50 ≈ 3.5 / 7.07 ≈ 0.495 inches
So the standard deviation of your sample means across many samples is less than half an inch. That's why larger samples give tighter estimates.
Comparing Sampling Distributions
Here's how standard error changes with sample size for a population with σ = 10:
| Sample Size (n) | Standard Error |
|---|---|
| 25 | 2.0 |
| 100 | 1.0 |
| 400 | 0.5 |
| 1000 | 0.32 |
Notice: to halve the SE, you need to quadruple the sample size. This is why diminishing returns kick in hard: going from n = 400 to n = 1000 adds 600 observations but only cuts the SE from 0.5 to 0.32.
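The table can be reproduced, and inverted, with a short sketch. The helper `required_n` is a hypothetical name; it solves SE = σ/√n for n.

```python
import math

sigma = 10

# Reproduce the table: SE = sigma / sqrt(n).
for n in (25, 100, 400, 1000):
    print(f"n = {n:>4}: SE = {sigma / math.sqrt(n):.2f}")

def required_n(sigma, target_se):
    """Smallest sample size whose standard error is at most target_se."""
    return math.ceil((sigma / target_se) ** 2)

# Halving the target SE quadruples the required sample size.
print(required_n(sigma, 2.0), required_n(sigma, 1.0), required_n(sigma, 0.5))
```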
How to Work With Sampling Distributions
Finding Probabilities
Once you know your sampling distribution is approximately normal (via CLT or population normality), you can calculate z-scores:
z = (x̄ - μ) / (σ/√n)
Example: You want to know the probability that a sample mean exceeds 52 when μ = 50, σ = 15, and n = 36.
SE = 15/√36 = 15/6 = 2.5
z = (52 - 50) / 2.5 = 0.8
P(x̄ > 52) = P(z > 0.8) ≈ 0.2119
About 21% of samples will have a mean above 52.
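The same calculation can be done without a z-table and cross-checked by simulation. In this sketch, the normally distributed population and the 5,000 repetitions are assumptions added for the check; the example itself only specifies μ, σ, and n.

```python
import math
import random
from statistics import NormalDist

random.seed(7)  # fixed seed so the run is reproducible

mu, sigma, n, threshold = 50, 15, 36, 52

# Analytic answer from the standard normal distribution.
se = sigma / math.sqrt(n)
z = (threshold - mu) / se
p_analytic = 1 - NormalDist().cdf(z)

# Simulation check: draw many samples from an ASSUMED normal population
# and count how often the sample mean exceeds the threshold.
reps = 5_000
hits = 0
for _ in range(reps):
    sample_mean = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    if sample_mean > threshold:
        hits += 1
p_simulated = hits / reps

print(f"Analytic:  {p_analytic:.4f}")
print(f"Simulated: {p_simulated:.4f}")
```

The two numbers should agree to within a couple of percentage points; the gap is simulation noise.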
Common Mistakes
- Confusing σ with SE: σ describes population variability. SE describes how sample statistics vary.
- Forgetting sample size matters: A sample of 5 behaves very differently from a sample of 500.
- Assuming normality without checking: If your population is heavily skewed and n is small, CLT doesn't rescue you.
When This Actually Comes Up
You'll encounter sampling distributions in:
- Hypothesis testing (calculating p-values)
- Confidence interval construction
- Understanding margin of error in polls
- Any situation where you're generalizing from sample to population
Polls are a practical example. When a poll says "47% ± 3%", that ±3 is the margin of error: typically about 1.96 standard errors of the sampling distribution of the sample proportion, which corresponds to 95% confidence.
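As a rough sketch of where a figure like ±3% comes from, the standard error of a sample proportion is √(p(1 − p)/n). The sample size of 1,000 and the 95% confidence level below are assumptions; the poll example does not state either.

```python
import math
from statistics import NormalDist

p_hat = 0.47       # reported proportion from the poll example
n = 1_000          # hypothetical sample size (not given in the text)
confidence = 0.95  # assumed confidence level

# Standard error of a sample proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Critical z-value for a two-sided interval (about 1.96 at 95%).
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * se
print(f"SE = {se:.4f}, margin of error = ±{margin:.1%}")
```

Under these assumptions the margin lands close to 3 percentage points, which is why polls of roughly a thousand people so often report that figure.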
Getting Started: The Checklist
When working with any problem involving sampling distributions:
- Identify the population parameter you care about (μ, p, σ)
- Determine your sample size n
- Calculate SE using the appropriate formula
- Check if you can assume normality (CLT conditions)
- Convert to z-score if needed: z = (statistic - parameter) / SE
- Use z-table or calculator to find your probability
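The checklist can be wrapped into one small function. This is a sketch under stated assumptions: the function name, the `population_is_normal` flag, and the n < 30 guard (the rule of thumb, not a strict requirement) are illustrative choices, and the threshold of 66 inches is a made-up question about the height example.

```python
import math
from statistics import NormalDist

def prob_mean_exceeds(threshold, mu, sigma, n, population_is_normal=False):
    """P(sample mean > threshold), assuming a normal sampling distribution."""
    # Calculate SE for the sample mean.
    se = sigma / math.sqrt(n)
    # Rough normality check using the n >= 30 rule of thumb.
    if not population_is_normal and n < 30:
        raise ValueError(
            "n < 30 and population not known to be normal: "
            "the normal approximation may be unreliable"
        )
    # Convert to a z-score, then look up the upper-tail probability.
    z = (threshold - mu) / se
    return 1 - NormalDist().cdf(z)

# Height example: mu = 65, sigma = 3.5, n = 50; threshold 66 is hypothetical.
p = prob_mean_exceeds(66, mu=65, sigma=3.5, n=50)
print(f"P(sample mean > 66) = {p:.4f}")
```

Raising an error for small samples from non-normal populations is a deliberately conservative design choice; a real analysis might fall back to a different method instead.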
The math is straightforward once you stop overthinking it. SE tells you precision. CLT tells you shape. From there, standard normal calculations handle the rest.