Two Way Table Statistics: Complete Guide

What Is a Two Way Table?

A two way table is a grid that shows how two categorical variables relate to each other. You count how many observations fall into each combination of categories, then arrange those counts in rows and columns.

That's it. No magic involved.

Statisticians also call these contingency tables or crosstabs. The names are interchangeable. Use whichever your audience prefers.

Why Bother With Two Way Tables?

Because they answer a simple question: "Does variable A relate to variable B?"

Example: You survey 500 people about smoking and lung disease. A two way table shows you exactly how many smokers have lung disease, how many smokers don't, how many non-smokers have lung disease, and so on. You can see patterns instantly.

Raw data dumps don't show you these relationships. Two way tables do.

The Basic Structure

Every two way table has the same anatomy:

Rows represent one variable's categories
Columns represent the other variable's categories
Cell counts show how many observations hit each combination
Row totals show the total for each row
Column totals show the total for each column
Grand total is the total number of observations

Marginal Distributions

The row and column totals sit in the margins of the table. They show you the distribution of each variable by itself, ignoring the other variable. These are called marginal distributions.

Conditional Distributions

Conditional distributions show you one variable's pattern, given a specific value of the other variable. You calculate these by dividing cell counts by row or column totals.

If you want to know "what percentage of smokers have lung disease," you divide the smoker-lung disease count by total smokers. That's a conditional distribution.
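As a rough illustration, the sketch below computes one marginal and two conditional proportions for the smoking survey. The four cell counts are invented purely for this example (the text gives only the sample size of 500), so treat the resulting percentages as placeholders rather than findings.

```python
# Hypothetical counts for a 500-person survey (illustrative only, not real data).
counts = {
    ("smoker", "disease"): 45,
    ("smoker", "no disease"): 105,
    ("non-smoker", "disease"): 35,
    ("non-smoker", "no disease"): 315,
}

grand_total = sum(counts.values())

# Marginal distribution of smoking status: a row total divided by the grand total.
smoker_total = counts[("smoker", "disease")] + counts[("smoker", "no disease")]
non_smoker_total = grand_total - smoker_total
marginal_smoker = smoker_total / grand_total

# Conditional distribution: lung disease rate within each smoking group.
disease_given_smoker = counts[("smoker", "disease")] / smoker_total
disease_given_non_smoker = counts[("non-smoker", "disease")] / non_smoker_total

print(f"Share of smokers overall: {marginal_smoker:.0%}")
print(f"Lung disease among smokers: {disease_given_smoker:.0%}")
print(f"Lung disease among non-smokers: {disease_given_non_smoker:.0%}")
```

The marginal figure uses the grand total as its denominator, while each conditional figure uses a single row total. Picking the wrong denominator is the usual way these two get confused.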

Reading a Two Way Table: A Real Example

Here's a table showing 1,000 people's preference for coffee or tea, broken down by age group:

Age Group	Coffee	Tea	Row Total
Under 30	180	70	250
30-50	150	150	300
Over 50	100	350	450
Column Total	430	570	1000

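From this table, you can immediately see that younger people lean toward coffee while older people lean toward tea: 72% of the under-30 group chose coffee (180 of 250), compared with about 22% of the over-50 group (100 of 450). The pattern is visible before you run any test.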

Types of Two Way Tables

Frequency Tables

These show raw counts. The table above is a frequency table.

Relative Frequency Tables

These show proportions or percentages. You divide each cell by the grand total.

Age Group	Coffee	Tea	Row Total
Under 30	18.0%	7.0%	25.0%
30-50	15.0%	15.0%	30.0%
Over 50	10.0%	35.0%	45.0%
Column Total	43.0%	57.0%	100%
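The percentages in this table can be reproduced with a few lines of plain Python. This is a minimal sketch using the counts from the coffee/tea frequency table; the variable names are just for illustration.

```python
# Counts from the coffee/tea frequency table.
counts = {
    "Under 30": {"Coffee": 180, "Tea": 70},
    "30-50": {"Coffee": 150, "Tea": 150},
    "Over 50": {"Coffee": 100, "Tea": 350},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Each cell divided by the grand total gives its relative frequency.
relative = {
    age: {drink: n / grand_total for drink, n in row.items()}
    for age, row in counts.items()
}

for age, row in relative.items():
    cells = "  ".join(f"{drink}: {share:.1%}" for drink, share in row.items())
    print(f"{age:<9} {cells}")
```

Because every cell shares the same denominator, all six relative frequencies add up to 100%.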

Conditional Probability Tables

These show percentages within rows or columns. Row percentages answer "within each age group, what drink do they prefer?" Column percentages answer "within each drink preference, what age groups are represented?"
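One way to compute both kinds of percentages is with pandas, as sketched below on the coffee/tea counts. The names `row_pct` and `col_pct` are chosen for this example only.

```python
import pandas as pd

# Counts from the coffee/tea frequency table.
table = pd.DataFrame(
    {"Coffee": [180, 150, 100], "Tea": [70, 150, 350]},
    index=["Under 30", "30-50", "Over 50"],
)

# Row percentages: each cell divided by its row total (each row sums to 100).
row_pct = table.div(table.sum(axis=1), axis=0) * 100

# Column percentages: each cell divided by its column total (each column sums to 100).
col_pct = table.div(table.sum(axis=0), axis=1) * 100

print(row_pct.round(1))
print(col_pct.round(1))
```

The same cell gives two different answers depending on the denominator: 72% of the under-30 group prefer coffee, but the under-30 group makes up only about 42% of all coffee drinkers (180 of 430).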

How to Create a Two Way Table

Here's the practical process:

Identify your two variables. Both must be categorical.
List the categories for each variable. Keep them mutually exclusive.
Collect or organize your data. Each observation goes into exactly one cell.
Count observations per cell. Tally manually or use software.
Calculate totals. Row totals, column totals, and grand total.
Add marginal distributions if needed. Converting the row and column totals to percentages of the grand total gives readers context.
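The counting and totaling steps above can be sketched with the standard library alone. The eight observations here are hypothetical, made up only to keep the example small.

```python
from collections import Counter

# Hypothetical raw observations: one (age group, drink) pair per respondent.
observations = [
    ("Under 30", "Coffee"), ("Under 30", "Coffee"), ("Under 30", "Tea"),
    ("30-50", "Coffee"), ("30-50", "Tea"),
    ("Over 50", "Tea"), ("Over 50", "Tea"), ("Over 50", "Coffee"),
]

# Count observations per cell: each pair lands in exactly one cell.
cell_counts = Counter(observations)

# Row totals, column totals, and the grand total.
row_totals = Counter(age for age, _ in observations)
col_totals = Counter(drink for _, drink in observations)
grand_total = len(observations)

print(cell_counts[("Over 50", "Tea")])
print(row_totals["Under 30"], col_totals["Coffee"], grand_total)
```

A quick sanity check on any hand-built table: the row totals and the column totals should each add up to the grand total.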

Doing It in Excel

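Excel's PivotTable feature handles this well. Select your data, insert a PivotTable, drag one variable to Rows and the other to Columns. If your data has one row per observation, drag either variable to Values and summarize it by Count. If your data is already aggregated with a count column, drag that column to Values and summarize it by Sum. Done.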

Doing It in Python

import pandas as pd

# Create a sample dataset
data = {'Age': ['Under 30', 'Under 30', '30-50', '30-50', 'Over 50', 'Over 50'],
        'Drink': ['Coffee', 'Tea', 'Coffee', 'Tea', 'Coffee', 'Tea'],
        'Count': [180, 70, 150, 150, 100, 350]}

df = pd.DataFrame(data)

# Create the crosstab
table = pd.pivot_table(df, values='Count', index='Age', columns='Drink', aggfunc='sum', fill_value=0)
print(table)
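The example above starts from counts that are already aggregated. If your data instead has one row per respondent, `pd.crosstab` can do the counting for you. The sketch below assumes a small hypothetical DataFrame named `responses`; the six rows are invented for illustration.

```python
import pandas as pd

# Hypothetical raw survey responses, one row per respondent.
responses = pd.DataFrame({
    "Age": ["Under 30", "Under 30", "30-50", "Over 50", "Over 50", "Over 50"],
    "Drink": ["Coffee", "Tea", "Coffee", "Tea", "Tea", "Coffee"],
})

# crosstab counts the rows in each Age/Drink combination;
# margins=True appends row and column totals, labeled "Total" here.
table = pd.crosstab(responses["Age"], responses["Drink"],
                    margins=True, margins_name="Total")
print(table)
```

Note that pandas sorts string categories alphabetically by default, so "30-50" will appear before "Under 30" unless you reorder the rows yourself.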

Statistical Tests for Two Way Tables

Tables show patterns. Tests tell you whether a pattern is stronger than what random sampling variation alone would plausibly produce.

Chi-Square Test

The most common test for two way tables. It compares observed counts to counts you'd expect if the variables had no relationship.

You calculate expected counts using this formula:

Expected = (Row Total × Column Total) / Grand Total

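Then you calculate the chi-square statistic by summing over every cell:

Chi-Square = Sum of (Observed − Expected)² / Expected

Compare the result to the critical value for (Rows − 1) × (Columns − 1) degrees of freedom at your chosen significance level. If your chi-square exceeds that value, you reject the null hypothesis of independence.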

Requirements: as a rule of thumb, expected counts (not observed counts) should be 5 or more in each cell. If not, use Fisher's exact test instead, which is most often applied to 2×2 tables.
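To make the arithmetic concrete, here is a minimal sketch that applies the expected-count formula to the coffee/tea table and then sums (observed − expected)² / expected over all cells, which is the standard Pearson chi-square statistic. The critical value 5.991 (significance level 0.05, 2 degrees of freedom) is hard-coded as an assumption for this table size; other table sizes or significance levels need a different value.

```python
# Observed counts from the coffee/tea table (rows: age groups; columns: Coffee, Tea).
observed = [
    [180, 70],
    [150, 150],
    [100, 350],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count per cell under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Check the expected-count rule of thumb before trusting the test.
min_expected = min(min(row) for row in expected)

CRITICAL_5PCT_DF2 = 5.991  # chi-square critical value for alpha = 0.05, df = 2
print(f"chi2 = {chi2:.2f}, dof = {dof}, smallest expected count = {min_expected:.1f}")
print("Reject independence" if chi2 > CRITICAL_5PCT_DF2 else "Fail to reject independence")
```

For this table the statistic comes out at roughly 171, far above the critical value, so the age and drink variables are clearly not independent in this sample. In practice you would more likely call a library routine such as `scipy.stats.chi2_contingency`, which also reports a p-value; the hand-rolled version is here only to show where the numbers come from.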

Cramér's V

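Chi-square tells you if a relationship exists, but its size grows with the sample, so it is a poor measure of strength. Cramér's V tells you how strong the relationship is by rescaling chi-square: divide it by the grand total times the smaller of (Rows − 1) and (Columns − 1), then take the square root. The result ranges from 0 (no relationship) to 1 (perfect relationship).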

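Interpretation: as a rough guide, V around 0.1 is weak, 0.3 is moderate, 0.5 or higher is strong. These cutoffs are conventions, not hard rules, and what counts as strong varies by field and by table size.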
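A small sketch of the calculation, assuming the usual definition V = sqrt(chi-square / (n × min(rows − 1, columns − 1))), where n is the grand total. The function name `cramers_v` is made up for this example, and the input is the coffee/tea table as a list of rows.

```python
import math

def cramers_v(observed):
    """Cramér's V for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Pearson chi-square, with expected count r * c / n for each cell.
    chi2 = sum(
        (o - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(observed, row_totals)
        for o, c in zip(row, col_totals)
    )
    # Scale by n times the smaller of (rows - 1) and (columns - 1).
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

v = cramers_v([[180, 70], [150, 150], [100, 350]])
print(f"Cramér's V = {v:.2f}")
```

For the coffee/tea table this gives a value of roughly 0.41, which sits between "moderate" and "strong" on the rough scale above.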

What Two Way Tables Cannot Tell You

These tables show association, not causation. Just because two variables are related doesn't mean one causes the other.

Example: A table might show that people who eat cereal score higher on tests. That doesn't mean cereal makes you smarter. A third variable (maybe income, or family habits) could explain both.

Two way tables are descriptive tools. For causation, you need experimental design, not just cross-tabulation.

Common Mistakes to Avoid

Ignoring sample size. Small samples produce unstable percentages. A 60-40 split in a group of 10 people is 6 versus 4, and one person changing their answer turns it into an even split.
Mixing up row vs. column percentages. Make sure you know which one answers your question.
Forgetting to check expected counts. Chi-square breaks down with sparse data.
Reading down columns when you should read across rows. Know what comparison you're actually making.

When to Use Two Way Tables

Two way tables work best when:

Both variables are categorical (or discrete with few values)
You want to show the actual distribution of responses
You need to explain findings to non-statisticians
You're preparing data for a chi-square test

They don't work well when variables are continuous. For age as a continuous variable, you'd bin it into groups first. For income, same thing.
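Binning can be as simple as a function that maps each value to a category before counting. The sketch below assumes the three age groups from the coffee/tea example, with ages 30 through 50 inclusive falling in the middle group; the respondents listed are hypothetical.

```python
from collections import Counter

def age_group(age):
    """Map an age in years to one of the three groups used in the example table."""
    if age < 30:
        return "Under 30"
    if age <= 50:
        return "30-50"
    return "Over 50"

# Hypothetical respondents: (age in years, preferred drink).
respondents = [
    (22, "Coffee"), (29, "Tea"), (30, "Coffee"),
    (50, "Tea"), (51, "Tea"), (67, "Tea"),
]

# Bin each age, then count the (age group, drink) cells as usual.
cell_counts = Counter((age_group(age), drink) for age, drink in respondents)
print(cell_counts)
```

Where you draw the bin boundaries can change the pattern the table shows, so decide on them before looking at the results. If you are already working in pandas, `pd.cut` does the same job on a whole column.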

The Bottom Line

Two way tables are a foundational tool. They organize categorical data, reveal patterns, and set you up for formal statistical testing. Learn to read them, build them, and know their limits.

That's all you need.