Cumulative Frequency: Statistical Analysis Guide
What Is Cumulative Frequency?
Cumulative frequency is the running total of frequencies as you move through data in order. It shows how many observations fall at or below a certain value. That's it. Nothing complicated.
You calculate it by adding each frequency to the total of all frequencies that came before it. Start at zero, then keep adding.
Why Cumulative Frequency Matters
You need cumulative frequency when you want to answer questions like:
- What percentage of values are below X?
- What is the 75th percentile?
- How many students scored below 70 on the test?
Raw frequency tables tell you how many fall in each class. Cumulative frequency tells you how many fall at or under a threshold. This is the difference that matters.
Building a Cumulative Frequency Table
Here's the process, using test scores for 40 students as the running example.
Step 1: Organize Your Data
Create a frequency distribution table first. Group your data into classes if it's continuous or has many values.
Step 2: Add a Cumulative Frequency Column
Start with the first frequency. Then each subsequent row gets the previous cumulative frequency plus the current frequency.
| Score Range | Frequency | Cumulative Frequency |
|---|---|---|
| 0–20 | 3 | 3 |
| 21–40 | 7 | 10 |
| 41–60 | 12 | 22 |
| 61–80 | 10 | 32 |
| 81–100 | 8 | 40 |
Check your work: the final cumulative frequency should equal your total number of observations. 40 students. It checks out.
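The running total in Step 2 can be sketched in a few lines of Python. This is a minimal illustration using the frequencies from the table above; the variable names are only for this example.

```python
from itertools import accumulate

# Class labels and frequencies copied from the score table above.
classes = ["0–20", "21–40", "41–60", "61–80", "81–100"]
frequencies = [3, 7, 12, 10, 8]

# Running total: each entry is the previous cumulative value
# plus the current class frequency.
cumulative = list(accumulate(frequencies))

# Check: the last cumulative frequency must equal the number of observations.
assert cumulative[-1] == sum(frequencies)

for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label:>7} | {f:>2} | {cf:>2}")
```

If the final check fails, a frequency was most likely mistyped or a class was skipped.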
Step 3: Read Your Table
From this table, you can instantly see that 32 students scored 80 or below. You can also see that 22 students scored 60 or below. No extra math required.
The Cumulative Frequency Graph (Ogive)
An ogive is just a line graph plotting cumulative frequency against the upper class boundary. It gives you a visual representation of the distribution.
How to Plot an Ogive
- Use the upper boundary of each class on the x-axis
- Use the cumulative frequency on the y-axis
- Plot points at each upper boundary
- Connect the points with straight lines
- The graph always starts at (lower boundary of first class, 0)
The ogive rises from left to right. Steeper sections mean more data points clustered in that range. Flatter sections mean sparse distribution.
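As a rough sketch, the points of the ogive for the score table can be computed as below. It assumes scores are whole numbers, so the class boundaries are placed halfway between adjacent class limits (20.5, 40.5, and so on); that convention is an assumption of this example, not something the table itself states.

```python
from itertools import accumulate

# Assumed boundaries: whole-number scores, so the gap between
# adjacent classes (e.g. 20 and 21) is split at the half unit.
lower_boundary_first_class = -0.5
upper_boundaries = [20.5, 40.5, 60.5, 80.5, 100.5]
frequencies = [3, 7, 12, 10, 8]

cumulative = list(accumulate(frequencies))

# Start at (lower boundary of first class, 0), then add one point
# per class at its upper boundary.
ogive_points = [(lower_boundary_first_class, 0)] + list(
    zip(upper_boundaries, cumulative)
)

# Slope of each straight segment: observations per unit of score.
# A larger slope means the data is more densely packed in that class.
slopes = [
    (y2 - y1) / (x2 - x1)
    for (x1, y1), (x2, y2) in zip(ogive_points, ogive_points[1:])
]

steepest_segment = slopes.index(max(slopes))
print(ogive_points)
print("Steepest segment index:", steepest_segment)
```

The x and y values in `ogive_points` can be passed to any plotting tool; the steepest segment here corresponds to the 41–60 class, which has the highest frequency.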
Finding the Median from Cumulative Frequency
The median is the value at the 50th percentile. Here's how to find it:
Method 1: From a Table
Divide your total frequency by 2. That's your median position.
Using the table above: 40 / 2 = 20
Find the class where cumulative frequency first reaches or exceeds 20. The 21–40 class only gets to 10, while the 41–60 class gets to 22, so the median falls somewhere in the 41–60 class.
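To turn "somewhere in this class" into a single number, a common approach is linear interpolation within the median class. The sketch below assumes values are spread evenly inside each class and that boundaries sit at the half unit (40.5 to 60.5 for the 41–60 class); both are assumptions of this example.

```python
from bisect import bisect_left
from itertools import accumulate

# Assumed class boundaries for whole-number scores.
boundaries = [-0.5, 20.5, 40.5, 60.5, 80.5, 100.5]
frequencies = [3, 7, 12, 10, 8]

cumulative = list(accumulate(frequencies))
n = cumulative[-1]
position = n / 2  # median position for grouped data

# Index of the first class whose cumulative frequency
# reaches or exceeds the median position.
idx = bisect_left(cumulative, position)

# Cumulative frequency of all classes before the median class.
cf_before = cumulative[idx - 1] if idx > 0 else 0

lower = boundaries[idx]
width = boundaries[idx + 1] - boundaries[idx]

# Interpolate: how far into the class the median position lies.
median_estimate = lower + (position - cf_before) / frequencies[idx] * width
print(round(median_estimate, 2))
```

The estimate lands a little above 57, inside the 41–60 class as expected.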
Method 2: From an Ogive
Draw a horizontal line from the halfway point on the y-axis (n/2, or 50% if the axis shows cumulative percentage) to the ogive curve. Drop a vertical line from that intersection to the x-axis. Read the value.
This gives you an approximate median directly from the graph. Quick and dirty, but useful.
Finding Quartiles and Percentiles
The same principle applies to any percentile. You just change the percentage.
- First Quartile (Q1): 25% position = n/4
- Second Quartile (Q2): 50% position = n/2 (this is the median)
- Third Quartile (Q3): 75% position = 3n/4
For our 40-student example:
- Q1 position = 40/4 = 10th student
- Q2 position = 40/2 = 20th student
- Q3 position = 3 × 40/4 = 30th student
Use your cumulative frequency table to locate which class each quartile falls into.
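The lookup can be sketched as a small helper that returns the first class whose cumulative frequency reaches or exceeds a given position. The helper name `class_for_position` is hypothetical and used only for this illustration.

```python
from bisect import bisect_left
from itertools import accumulate

classes = ["0–20", "21–40", "41–60", "61–80", "81–100"]
frequencies = [3, 7, 12, 10, 8]
cumulative = list(accumulate(frequencies))
n = cumulative[-1]


def class_for_position(position):
    """Return the label of the first class whose cumulative frequency
    reaches or exceeds the given position (hypothetical helper)."""
    return classes[bisect_left(cumulative, position)]


quartile_classes = {
    "Q1": class_for_position(n / 4),
    "Q2": class_for_position(n / 2),
    "Q3": class_for_position(3 * n / 4),
}
print(quartile_classes)
```

Note that the Q1 position (10) equals the cumulative frequency of the 21–40 class exactly, so Q1 sits at the very top of that class. The same helper works for any percentile: pass `p * n / 100` as the position.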
Interquartile Range (IQR)
The IQR is Q3 minus Q1. It tells you how widely the middle 50% of your data is spread. It's less affected by outliers than the full range.
Calculate it: find Q1, find Q3, subtract. That's your IQR.
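For grouped data, Q1 and Q3 are estimates. One way to get them is linear interpolation inside the class that contains each quartile position. The sketch below does this for the 40-student table, assuming half-unit class boundaries and evenly spread values within each class; the function name `grouped_quantile` is made up for this example.

```python
from bisect import bisect_left
from itertools import accumulate

# Assumed class boundaries for whole-number scores.
boundaries = [-0.5, 20.5, 40.5, 60.5, 80.5, 100.5]
frequencies = [3, 7, 12, 10, 8]
cumulative = list(accumulate(frequencies))
n = cumulative[-1]


def grouped_quantile(fraction):
    """Estimate the value below which `fraction` of the data falls
    (0 < fraction <= 1), interpolating linearly within the class."""
    position = fraction * n
    idx = bisect_left(cumulative, position)
    cf_before = cumulative[idx - 1] if idx > 0 else 0
    lower, upper = boundaries[idx], boundaries[idx + 1]
    return lower + (position - cf_before) / frequencies[idx] * (upper - lower)


q1 = grouped_quantile(0.25)
q3 = grouped_quantile(0.75)
iqr = q3 - q1
print(q1, q3, iqr)
```

Under these assumptions Q1 comes out at about 40.5 and Q3 at about 76.5, so the middle half of the scores spans roughly 36 points.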
Common Mistakes to Avoid
- Using the wrong boundary: Always use upper class boundaries for plotting ogives, not midpoints
- Forgetting to start at zero: The ogive must start at the lower boundary of the first class with cumulative frequency 0
- Mixing up "less than" and "more than": Less than ogives rise from left to right. More than ogives fall from left to right
- Not checking the final total: Your last cumulative frequency must equal your total observations
Less Than vs. More Than Ogives
There are two types of ogives:
- Less than ogive: Plots cumulative frequency using upper class boundaries. Shows how many values fall at or below each point.
- More than ogive: Plots cumulative frequency using lower class boundaries. Shows how many values fall at or above each point.
You can plot both on the same graph. They will intersect at the median. This is a handy verification check.
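The intersection check can be tried numerically. The sketch below builds both ogives for the score table on the same assumed half-unit boundaries, then finds where the two piecewise-linear curves cross.

```python
from itertools import accumulate

# Assumed class boundaries for whole-number scores.
boundaries = [-0.5, 20.5, 40.5, 60.5, 80.5, 100.5]
frequencies = [3, 7, 12, 10, 8]
n = sum(frequencies)

# Less than ogive: observations below each boundary.
less_than = [0] + list(accumulate(frequencies))
# More than ogive: observations at or above each boundary.
more_than = [n - cf for cf in less_than]

# The curves cross where (less_than - more_than) changes sign.
diff = [lt - mt for lt, mt in zip(less_than, more_than)]

crossing_x = crossing_y = None
for i in range(len(diff) - 1):
    if diff[i] <= 0 <= diff[i + 1] and diff[i] != diff[i + 1]:
        # Fraction of the way along this segment where the difference is zero.
        t = -diff[i] / (diff[i + 1] - diff[i])
        crossing_x = boundaries[i] + t * (boundaries[i + 1] - boundaries[i])
        crossing_y = less_than[i] + t * (less_than[i + 1] - less_than[i])
        break

print(round(crossing_x, 2), crossing_y)
```

The curves cross at a height of n/2 = 20, and the x-value of the crossing matches the interpolated median for this table.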
Practical Example: Analyzing Survey Data
Suppose you're analyzing monthly income data for 100 workers:
| Income ($000s) | Frequency | Cumulative Frequency |
|---|---|---|
| 10–20 | 15 | 15 |
| 20–30 | 28 | 43 |
| 30–40 | 35 | 78 |
| 40–50 | 14 | 92 |
| 50–60 | 8 | 100 |
From this table:
- 43 workers earn $30,000 or less
- 78 workers earn $40,000 or less
- The 75th percentile falls in the $30,000–$40,000 range
No need to calculate anything else. Just read the table.
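If a single estimated value is wanted rather than a class, the same table supports it. The sketch below converts the cumulative frequencies to percentages and interpolates the 75th percentile, assuming incomes are spread evenly within each class.

```python
from bisect import bisect_left
from itertools import accumulate

labels = ["10–20", "20–30", "30–40", "40–50", "50–60"]
boundaries = [10, 20, 30, 40, 50, 60]  # income in $000s
frequencies = [15, 28, 35, 14, 8]

cumulative = list(accumulate(frequencies))
n = cumulative[-1]

# Cumulative percentage: share of workers at or below each upper boundary.
cumulative_percent = [100 * cf / n for cf in cumulative]

# Locate the class containing the 75th percentile position.
position = 0.75 * n
idx = bisect_left(cumulative, position)
p75_class = labels[idx]

# Interpolate within that class (assumes an even spread inside the class).
cf_before = cumulative[idx - 1] if idx > 0 else 0
width = boundaries[idx + 1] - boundaries[idx]
p75_estimate = boundaries[idx] + (position - cf_before) / frequencies[idx] * width

print(cumulative_percent)
print(p75_class, round(p75_estimate, 2))
```

With these figures the 75th percentile is estimated at a little over 39 (in $000s), close to the top of the 30–40 class.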
When to Use Cumulative Frequency
Use it when:
- You have grouped data and need to find percentiles
- You want to compare distributions side by side
- You're screening for outliers using the IQR (for example, values more than 1.5 × IQR below Q1 or above Q3)
- You need to estimate values graphically (like reading medians from an ogive)
Don't use it when your data is discrete with few values. A simple frequency table works better for small datasets where every value matters individually.
Quick Reference: Formulas
- Median position: (n + 1) / 2 for ungrouped data, n / 2 for grouped data
- Q1 position: n / 4
- Q3 position: 3n / 4
- IQR: Q3 - Q1
Where n = total number of observations.
Final Notes
Cumulative frequency is a tool. It doesn't add complexity for its own sake. You use it because it answers specific questions that raw frequencies cannot. Learn to read the table, learn to plot the graph, learn to extract percentiles. That's all you need.