Two Way Table Statistics: A Complete Guide

What Is a Two Way Table?

A two way table is a grid that shows how two categorical variables relate to each other. You count how many observations fall into each combination of categories, then arrange those counts in rows and columns.

That's it. No magic involved.

Statisticians also call these contingency tables or crosstabs. The names are interchangeable. Use whichever your audience prefers.

Why Bother With Two Way Tables?

Because they answer a simple question: "Does variable A relate to variable B?"

Example: You survey 500 people about smoking and lung disease. A two way table shows you exactly how many smokers have lung disease, how many smokers don't, how many non-smokers have lung disease, and so on. You can see patterns instantly.

Raw data dumps don't show you these relationships. Two way tables do.

The Basic Structure

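Every two way table has the same anatomy: one variable's categories label the rows, the other variable's categories label the columns, each cell holds the count for one row-column combination, and the totals sit along the edges.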

Marginal Distributions

The row and column totals sit in the margins of the table. They show you the distribution of each variable by itself, ignoring the other variable. These are called marginal distributions.

Conditional Distributions

Conditional distributions show you one variable's pattern, given a specific value of the other variable. You calculate these by dividing cell counts by row or column totals.

If you want to know "what percentage of smokers have lung disease," you divide the smoker-lung disease count by total smokers. That's a conditional distribution.

Reading a Two Way Table: A Real Example

Here's a table showing 1,000 people's preference for coffee or tea, broken down by age group:

                Coffee     Tea    Row Total
Under 30           180      70          250
30-50              150     150          300
Over 50            100     350          450
Column Total       430     570         1000

From this table, you can immediately see that younger people lean toward coffee (180 of 250 under 30, or 72%) while older people lean toward tea (350 of 450 over 50, about 78%). The pattern is visible before you run any test.
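To make the margins concrete, here is a minimal sketch that recomputes the row totals, column totals, and marginal shares from the cell counts in the table above. It uses plain Python dictionaries; the variable names are illustrative only.

```python
# Cell counts from the coffee/tea table: rows are age groups, columns are drinks.
counts = {
    "Under 30": {"Coffee": 180, "Tea": 70},
    "30-50": {"Coffee": 150, "Tea": 150},
    "Over 50": {"Coffee": 100, "Tea": 350},
}

# Row totals: the marginal distribution of age group.
row_totals = {age: sum(drinks.values()) for age, drinks in counts.items()}

# Column totals: the marginal distribution of drink preference.
col_totals = {}
for drinks in counts.values():
    for drink, n in drinks.items():
        col_totals[drink] = col_totals.get(drink, 0) + n

grand_total = sum(row_totals.values())

# Express each margin as a share of all respondents.
age_share = {age: n / grand_total for age, n in row_totals.items()}
drink_share = {drink: n / grand_total for drink, n in col_totals.items()}

print(row_totals)
print(col_totals)
print(age_share)
print(drink_share)
```

The totals should match the Row Total and Column Total entries in the table, which is a quick way to check a hand-built table for tally errors.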

Types of Two Way Tables

Frequency Tables

These show raw counts. The table above is a frequency table.

Relative Frequency Tables

These show proportions or percentages. You divide each cell by the grand total.

                Coffee     Tea    Row Total
Under 30         18.0%    7.0%        25.0%
30-50            15.0%   15.0%        30.0%
Over 50          10.0%   35.0%        45.0%
Column Total     43.0%   57.0%         100%
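As a rough sketch of the arithmetic, the relative frequencies come from dividing every cell count by the grand total. The dictionary layout below is just one convenient way to hold the counts, not a required structure.

```python
# Cell counts from the frequency table.
counts = {
    "Under 30": {"Coffee": 180, "Tea": 70},
    "30-50": {"Coffee": 150, "Tea": 150},
    "Over 50": {"Coffee": 100, "Tea": 350},
}

# Grand total: every observation in the table.
grand_total = sum(n for drinks in counts.values() for n in drinks.values())

# Relative frequency of a cell = cell count / grand total.
relative = {
    age: {drink: n / grand_total for drink, n in drinks.items()}
    for age, drinks in counts.items()
}

# Print each row as percentages with one decimal place.
for age, drinks in relative.items():
    cells = "  ".join(f"{drink}: {share:.1%}" for drink, share in drinks.items())
    print(f"{age:<10}{cells}")
```

All cells of a relative frequency table add up to 100%, so summing them is an easy sanity check.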

Conditional Probability Tables

These show percentages within rows or columns. Row percentages answer "within each age group, what drink do they prefer?" Column percentages answer "within each drink preference, what age groups are represented?"
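The difference between row and column percentages is easiest to see side by side. This sketch, using the coffee/tea counts from this guide, divides each cell by its row total and then by its column total. Names such as `row_pct` and `col_pct` are illustrative.

```python
# Cell counts from the coffee/tea frequency table.
counts = {
    "Under 30": {"Coffee": 180, "Tea": 70},
    "30-50": {"Coffee": 150, "Tea": 150},
    "Over 50": {"Coffee": 100, "Tea": 350},
}

# Row percentages: within each age group, what share prefers each drink?
row_pct = {}
for age, drinks in counts.items():
    row_total = sum(drinks.values())
    row_pct[age] = {drink: n / row_total for drink, n in drinks.items()}

# Column totals are needed before column percentages can be computed.
col_totals = {}
for drinks in counts.values():
    for drink, n in drinks.items():
        col_totals[drink] = col_totals.get(drink, 0) + n

# Column percentages: within each drink, what share comes from each age group?
col_pct = {
    age: {drink: n / col_totals[drink] for drink, n in drinks.items()}
    for age, drinks in counts.items()
}

print(f"Under 30 who prefer coffee: {row_pct['Under 30']['Coffee']:.1%}")
print(f"Coffee drinkers who are under 30: {col_pct['Under 30']['Coffee']:.1%}")
```

The two printed figures use the same cell (180) but different denominators (250 versus 430), which is why they answer different questions.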

How to Create a Two Way Table

Here's the practical process:

  1. Identify your two variables. Both must be categorical.
  2. List the categories for each variable. Keep them mutually exclusive.
  3. Collect or organize your data. Each observation goes into exactly one cell.
  4. Count observations per cell. Tally manually or use software.
  5. Calculate totals. Row totals, column totals, and grand total.
  6. Add marginal distributions if needed. For context.
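The steps above can be sketched in a few lines of plain Python. The eight observations below are invented purely to show the mechanics; with real data you would load one (age group, drink) pair per respondent.

```python
from collections import Counter

# Hypothetical raw data: one (age group, drink) pair per respondent.
observations = [
    ("Under 30", "Coffee"),
    ("Under 30", "Coffee"),
    ("Under 30", "Tea"),
    ("30-50", "Coffee"),
    ("30-50", "Tea"),
    ("Over 50", "Tea"),
    ("Over 50", "Tea"),
    ("Over 50", "Coffee"),
]

# Steps 1-2: fix the two variables and their categories up front.
ages = ["Under 30", "30-50", "Over 50"]
drinks = ["Coffee", "Tea"]

# Steps 3-4: each observation lands in exactly one cell; count per cell.
cell_counts = Counter(observations)

# Step 5: row totals, column totals, and grand total.
row_totals = {a: sum(cell_counts[(a, d)] for d in drinks) for a in ages}
col_totals = {d: sum(cell_counts[(a, d)] for a in ages) for d in drinks}
grand_total = sum(row_totals.values())

# Print the finished table with its margins.
print(f"{'':<10}" + "".join(f"{d:>8}" for d in drinks) + f"{'Total':>8}")
for a in ages:
    row = "".join(f"{cell_counts[(a, d)]:>8}" for d in drinks)
    print(f"{a:<10}{row}{row_totals[a]:>8}")
footer = "".join(f"{col_totals[d]:>8}" for d in drinks)
print(f"{'Total':<10}{footer}{grand_total:>8}")
```

Listing the categories explicitly, rather than inferring them from the data, keeps empty cells visible as zeros instead of silently dropping them.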

Doing It in Excel

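Excel's PivotTable feature handles this well. Select your data, insert a PivotTable, drag one variable to Rows, the other to Columns, and a field to Values. Summarize that field by Count if each row of your data is one observation, or by Sum if your data already has a count column. Done.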

Doing It in Python

import pandas as pd

# Create a sample dataset
data = {'Age': ['Under 30', 'Under 30', '30-50', '30-50', 'Over 50', 'Over 50'],
        'Drink': ['Coffee', 'Tea', 'Coffee', 'Tea', 'Coffee', 'Tea'],
        'Count': [180, 70, 150, 150, 100, 350]}

df = pd.DataFrame(data)

# Pivot the pre-counted data into a two way table: Age in rows, Drink in columns
table = pd.pivot_table(df, values='Count', index='Age', columns='Drink', aggfunc='sum', fill_value=0)
print(table)
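The example above starts from data that is already counted. When the data has one row per respondent instead, `pd.crosstab` can do the counting. The following is a minimal sketch; the six rows in `raw` are invented for illustration.

```python
import pandas as pd

# Hypothetical raw data: one row per respondent, not pre-counted.
raw = pd.DataFrame({
    "Age": ["Under 30", "Under 30", "30-50", "Over 50", "Over 50", "Over 50"],
    "Drink": ["Coffee", "Tea", "Coffee", "Tea", "Tea", "Coffee"],
})

# crosstab counts how often each Age/Drink pair occurs;
# margins=True appends row and column totals labelled "All".
table = pd.crosstab(raw["Age"], raw["Drink"], margins=True)
print(table)

# normalize="index" turns each row into proportions that sum to 1.
row_pct = pd.crosstab(raw["Age"], raw["Drink"], normalize="index")
print(row_pct)
```

Note that pandas sorts the row labels alphabetically, so the age groups will not appear in their natural order unless you reorder them.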

Statistical Tests for Two Way Tables

Tables show patterns. Tests tell you whether those patterns are stronger than you'd expect from random noise alone.

Chi-Square Test

The most common test for two way tables. It compares observed counts to counts you'd expect if the variables had no relationship.

You calculate expected counts using this formula:

Expected = (Row Total × Column Total) / Grand Total
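Applied to the coffee/tea table from this guide, the formula can be sketched as follows. The nested-list layout is only one way to hold the counts.

```python
# Observed counts from the coffee/tea table (columns: Coffee, Tea).
observed = [
    [180, 70],    # Under 30
    [150, 150],   # 30-50
    [100, 350],   # Over 50
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected = (Row Total x Column Total) / Grand Total, for every cell.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print(row)
```

For example, the expected count of coffee drinkers under 30 is 250 × 430 / 1000 = 107.5, well below the 180 actually observed.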

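Then you calculate the chi-square statistic: for each cell, square the difference between the observed and expected count, divide by the expected count, and add up the results across all cells. Compare that sum to a critical value for (rows - 1) × (columns - 1) degrees of freedom. If your chi-square exceeds it, you reject the null hypothesis of independence.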

Requirements: as a rule of thumb, expected counts should be 5 or more in each cell. If not, use Fisher's exact test instead.
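Here is a minimal sketch of the whole test on the coffee/tea table, written in plain Python so each step stays visible. The critical value 5.991 is the standard table value for 2 degrees of freedom at the 0.05 level; the choice of 0.05 is an assumption made for this example.

```python
# Observed counts from the coffee/tea table (columns: Coffee, Tea).
observed = [
    [180, 70],    # Under 30
    [150, 150],   # 30-50
    [100, 350],   # Over 50
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rule-of-thumb check: every expected count should be at least 5.
min_expected = min(e for row in expected for e in row)

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) x (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumed significance level 0.05; 5.991 is the table value for df = 2.
CRITICAL_VALUE = 5.991
reject_independence = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.2f}, df = {df}")
print(f"smallest expected count = {min_expected}")
print(f"reject independence: {reject_independence}")
```

In practice you would likely let a statistics library handle this (SciPy's `scipy.stats.chi2_contingency` returns the statistic, p-value, degrees of freedom, and expected counts), but the hand calculation shows what the library is doing.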

Cramér's V

Chi-square tells you if a relationship exists. Cramér's V tells you how strong it is. It ranges from 0 (no relationship) to 1 (perfect relationship).

Interpretation, as a rough guide: V around 0.1 is weak, 0.3 is moderate, 0.5 or higher is strong.
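Cramér's V is the chi-square statistic rescaled by sample size and table shape: V = sqrt(chi-square / (n × (k - 1))), where n is the grand total and k is the smaller of the number of rows and columns. A minimal sketch on the coffee/tea table, with helper names that are illustrative only:

```python
import math

# Observed counts from the coffee/tea table (columns: Coffee, Tea).
observed = [
    [180, 70],    # Under 30
    [150, 150],   # 30-50
    [100, 350],   # Over 50
]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            total += (o - e) ** 2 / e
    return total


def cramers_v(table):
    """Cramér's V = sqrt(chi-square / (n * (min(rows, cols) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square_statistic(table) / (n * (k - 1)))


v = cramers_v(observed)
print(f"Cramér's V = {v:.2f}")
```

For this table V comes out at roughly 0.41, which falls between the moderate and strong benchmarks given above.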

What Two Way Tables Cannot Tell You

These tables show association, not causation. Just because two variables are related doesn't mean one causes the other.

Example: A table might show that people who eat cereal score higher on tests. That doesn't mean cereal makes you smarter. A third variable (maybe income, or family habits) could explain both.

Two way tables are descriptive tools. For causation, you need experimental design, not just cross-tabulation.

Common Mistakes to Avoid

When to Use Two Way Tables

Two way tables work best when both variables are categorical.

They don't work well when variables are continuous. For age as a continuous variable, you'd bin it into groups first. For income, same thing.
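Binning can be as simple as a function that maps each value to a group label. The sketch below reuses the three age groups from this guide's example. Where exactly the boundaries fall (here, 30 and 50 both land in the middle group) is an assumption you would need to settle for your own data, and the sample ages are invented.

```python
def age_group(age):
    """Map an age in years to one of three groups.

    Boundary handling is an assumption: 30 and 50 both count as "30-50".
    """
    if age < 30:
        return "Under 30"
    if age <= 50:
        return "30-50"
    return "Over 50"


# Hypothetical ages, chosen to include the boundary values.
ages = [22, 30, 41, 50, 51, 67]
groups = [age_group(a) for a in ages]

for a, g in zip(ages, groups):
    print(a, "->", g)
```

Once every observation has a group label, the binned variable can go into a two way table like any other categorical variable.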

The Bottom Line

Two way tables are a foundational tool. They organize categorical data, reveal patterns, and set you up for formal statistical testing. Learn to read them, build them, and know their limits.

That's all you need.