Hardy-Weinberg Equilibrium: Population Genetics

What Is Hardy-Weinberg Equilibrium?

Hardy-Weinberg Equilibrium (HWE) is a mathematical model that describes what happens to allele and genotype frequencies in an idealized population where evolution is not occurring. Think of it as a baseline — a null hypothesis against which you measure real populations.

The model states that allele and genotype frequencies will remain constant from generation to generation if five conditions are met: no mutation, no migration, infinite population size, random mating, and no natural selection. In the real world, these conditions are never all met exactly. That's exactly why HWE is useful.

You use it to detect evolutionary forces at work. If observed genotype frequencies match expected frequencies from HWE, the population appears to be in equilibrium. If they don't match, at least one of the model's conditions is being violated, and you get to figure out which.

The Hardy-Weinberg Equation

The equation sits at the center of everything:

p² + 2pq + q² = 1

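Where:

  - p = frequency of allele A
  - q = frequency of allele a, with p + q = 1
  - p² = expected frequency of genotype AA
  - 2pq = expected frequency of genotype Aa
  - q² = expected frequency of genotype aa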

The math is straightforward. If you know allele frequencies, you can predict genotype frequencies. If you know genotype frequencies, you can calculate allele frequencies.
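As a minimal sketch of the first direction (allele frequencies to genotype frequencies), the snippet below expands p² + 2pq + q² for a two-allele locus. The function name `genotype_frequencies` and the allele labels A and a are illustrative choices, not part of any library.

```python
def genotype_frequencies(p):
    """Expected Hardy-Weinberg genotype frequencies for a biallelic locus.

    p is the frequency of allele A; q = 1 - p is the frequency of allele a.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Hypothetical example: allele A at a frequency of 0.7
freqs = genotype_frequencies(0.7)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.2f}")
```

The three values always sum to 1, because they are the terms of (p + q)² with p + q = 1.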

Deriving Allele Frequencies from Genotype Data

To find p and q from genotype counts:

p = (2 × AA + Aa) / (2 × total individuals)

q = (2 × aa + Aa) / (2 × total individuals)

Or simpler: count the total number of each allele across all individuals, then divide by the total number of allele copies (which is 2 times the number of individuals).
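The allele-counting rule above can be sketched in a few lines. The function name `allele_frequencies` and the genotype counts are made up for illustration.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) from genotype counts at a biallelic locus.

    Each individual carries two allele copies, so the denominator is
    2 x (number of individuals).
    """
    total = n_AA + n_Aa + n_aa
    if total == 0:
        raise ValueError("need at least one genotyped individual")
    p = (2 * n_AA + n_Aa) / (2 * total)
    q = (2 * n_aa + n_Aa) / (2 * total)
    return p, q


# Hypothetical sample: 50 AA, 30 Aa, 20 aa
p, q = allele_frequencies(50, 30, 20)
print(f"p = {p:.2f}, q = {q:.2f}")
```

Here 130 of the 200 allele copies are A, so p is 0.65 and q is 0.35.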

The Five Assumptions

HWE only holds when all five conditions are satisfied. Every violation points to an evolutionary mechanism.

1. No Mutation

New alleles aren't arising and existing alleles aren't changing. If mutations occur, allele frequencies shift.

2. No Migration

No individuals are moving into or out of the population. Migration introduces new alleles or removes existing ones.

3. Infinite Population Size

Also called no genetic drift. In small populations, random sampling causes allele frequencies to bounce around generation to generation. Large populations minimize this effect.
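To see what "bounce around" looks like, here is a rough simulation sketch. It assumes a simple Wright-Fisher style model (each generation draws 2N allele copies at random from the previous generation's pool), which is one common way to model drift and is not something the text above specifies. The function name, population sizes, and seed are arbitrary.

```python
import random


def simulate_drift(p0, pop_size, generations, seed=0):
    """Track the frequency of allele A under random sampling alone."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n_copies = 2 * pop_size    # diploid: two allele copies per individual
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each copy in the next generation is A with probability p
        count_A = sum(1 for _copy in range(n_copies) if rng.random() < p)
        p = count_A / n_copies
        trajectory.append(p)
    return trajectory


small = simulate_drift(0.5, pop_size=10, generations=50)
large = simulate_drift(0.5, pop_size=1000, generations=50)
print("N = 10:  ", [round(x, 2) for x in small[::10]])
print("N = 1000:", [round(x, 2) for x in large[::10]])
```

In runs like this, the small population typically wanders far from 0.5 and may lose one allele entirely, while the large population stays close to its starting frequency.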

4. Random Mating

Individuals mate without regard to genotype. If mating is assortative (positive or negative), genotype frequencies deviate from expectations.

5. No Natural Selection

All genotypes have equal survival and reproductive success. Any selection pressure changes allele frequencies.
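A short sketch of how unequal fitness shifts allele frequencies in one generation. It uses the standard single-locus viability selection recursion, assuming random mating before selection. The relative fitness values (`w_AA`, `w_Aa`, `w_aa`) and the function name are illustrative assumptions.

```python
def next_p_under_selection(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of viability selection."""
    q = 1.0 - p
    # Mean fitness: genotype frequencies weighted by relative fitness
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    if mean_fitness == 0:
        raise ValueError("mean fitness is zero; no survivors")
    # A copies come from all of AA and half of the Aa survivors
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


# Equal fitness: the frequency does not change
print(next_p_under_selection(0.5, 1.0, 1.0, 1.0))
# Hypothetical case: aa homozygotes survive half as often
print(next_p_under_selection(0.5, 1.0, 1.0, 0.5))
```

With equal fitness values the function returns p unchanged, which is the no-selection condition. Any inequality moves p away from its current value.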

Why Scientists Use Hardy-Weinberg

Population geneticists and evolutionary biologists rely on HWE for several practical reasons:

Testing for Hardy-Weinberg Equilibrium

You compare observed genotype counts against expected counts under HWE. The chi-square (χ²) test is the standard approach.

The Chi-Square Test

Calculate expected counts using p², 2pq, and q², then compute:

χ² = Σ((observed - expected)² / expected)

Compare your χ² value to the critical value at your chosen significance level (usually α = 0.05) with 1 degree of freedom for a biallelic locus: three genotype classes, minus one, minus one more because p is estimated from the same data.

If your calculated χ² exceeds the critical value (3.84 at α = 0.05), you reject the null hypothesis: the observed genotype counts deviate significantly from Hardy-Weinberg expectations.
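A worked example may make the arithmetic concrete. The genotype counts here (50 AA, 30 Aa, 20 aa) are hypothetical and chosen only for illustration.

```latex
\begin{aligned}
n &= 50 + 30 + 20 = 100 \\
p &= \frac{2(50) + 30}{2(100)} = 0.65, \qquad q = 1 - p = 0.35 \\
E_{AA} &= p^2 n = 42.25, \qquad E_{Aa} = 2pq\,n = 45.5, \qquad E_{aa} = q^2 n = 12.25 \\
\chi^2 &= \frac{(50 - 42.25)^2}{42.25} + \frac{(30 - 45.5)^2}{45.5} + \frac{(20 - 12.25)^2}{12.25} \\
       &\approx 1.42 + 5.28 + 4.90 = 11.60
\end{aligned}
```

Since 11.60 is well above 3.84, this hypothetical sample would be judged out of Hardy-Weinberg proportions, with fewer heterozygotes than expected.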

Hardy-Weinberg in Human Genetics

HWE applies directly to understanding human genetic variation. Consider a trait like cystic fibrosis, caused by mutations in the CFTR gene. You can use HWE to estimate carrier frequency in a population.

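If the disease frequency (q²) is 1 in 4,000, then q = √(1/4,000) ≈ 0.0158 and p = 1 − q ≈ 0.984. The expected carrier frequency is 2pq ≈ 0.031, or roughly 1 in 32 people.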

This is why population genetics matters for genetic counseling. HWE gives you the math to estimate risk.
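As a minimal sketch, the carrier estimate can be computed from disease incidence alone. This assumes an autosomal recessive condition and that the population is in Hardy-Weinberg proportions at the locus. The function name `carrier_frequency` is illustrative.

```python
import math


def carrier_frequency(disease_incidence):
    """Estimate heterozygous carrier frequency (2pq) from incidence (q^2)."""
    if not 0.0 <= disease_incidence <= 1.0:
        raise ValueError("incidence must be between 0 and 1")
    q = math.sqrt(disease_incidence)  # frequency of the disease allele
    p = 1.0 - q
    return 2 * p * q


carriers = carrier_frequency(1 / 4000)
print(f"carrier frequency ~ {carriers:.4f} (about 1 in {1 / carriers:.0f})")
# prints: carrier frequency ~ 0.0311 (about 1 in 32)
```

Carriers are far more common than affected individuals: about 1 in 32 versus 1 in 4,000 in this example.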

Common Deviations and What They Signal

When genotype frequencies don't match HWE predictions, something specific is usually happening.

Excess Homozygotes

May indicate inbreeding or population stratification. Related individuals share alleles more often than expected under random mating.

Excess Heterozygotes

Often a sign of heterozygote advantage, where individuals carrying one copy of each allele have higher fitness than either homozygote. The sickle cell allele in malaria-endemic regions is a textbook example: heterozygotes are partly protected against malaria without developing sickle cell anemia.
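To tell which of these two patterns a sample shows, one option is to compare the observed heterozygote count with the HWE expectation. The sketch below also reports F = 1 − (observed heterozygotes / expected heterozygotes), a common summary that the text above does not introduce. Treat it as an added assumption: positive F means a heterozygote deficit (excess homozygotes), negative F means excess heterozygotes. The function name and counts are hypothetical.

```python
def heterozygote_deviation(n_AA, n_Aa, n_aa):
    """Return (expected heterozygote count, F) for a biallelic locus."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    p = (2 * n_AA + n_Aa) / (2 * n)
    expected_het = 2 * p * (1 - p) * n
    if expected_het == 0:
        raise ValueError("locus is monomorphic; no heterozygotes expected")
    f = 1 - n_Aa / expected_het
    return expected_het, f


for counts in [(50, 30, 20), (10, 80, 10)]:
    expected_het, f = heterozygote_deviation(*counts)
    direction = "excess homozygotes" if f > 0 else "excess heterozygotes"
    print(counts, f"expected Aa = {expected_het:.1f}, F = {f:.2f} ({direction})")
```

The sign of F only gives the direction of the deviation. Whether it is statistically meaningful still depends on a test against HWE expectations.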

Allele Frequency Changes Over Time

Could indicate selection, genetic drift in small populations, or migration from populations with different allele frequencies.

Hardy-Weinberg Calculator Tools

You don't need to do these calculations by hand. Several tools handle the math quickly.

| Tool | Best For | Limitations |
|---|---|---|
| Online HWE calculators | Quick single-locus tests | Limited sample sizes |
| Population genetics software (PLINK, Arlequin) | Large genomic datasets | Steep learning curve |
| R packages (HardyWeinberg) | Custom analyses, visualizations | Requires R knowledge |
| Python (scipy, numpy) | Automated pipelines | Programming required |

For most undergraduate and early graduate work, an online calculator handles basic HWE testing fine. For genome-wide studies, you need dedicated software.

Getting Started: Testing Your Own Data

Here's the practical workflow:

  1. Collect genotype data for your population at the locus of interest. You need counts for each of the three genotypes.
  2. Calculate allele frequencies from your genotype counts. Count each allele, divide by total allele copies.
  3. Calculate expected genotype frequencies using p², 2pq, and q². Multiply by total sample size to get expected counts.
  4. Run a chi-square test comparing observed vs. expected counts.
  5. Interpret results. If χ² > 3.84 (P < 0.05 at 1 degree of freedom), the genotype counts deviate significantly from HWE expectations.
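The steps above can be sketched as a single function. This is a standard-library-only illustration, not a replacement for dedicated software. The function name `hwe_chi_square` and the sample counts are hypothetical. For 1 degree of freedom the chi-square tail probability equals erfc(√(χ²/2)), which avoids needing scipy. If scipy is available, `scipy.stats.chi2.sf(chi2, 1)` should give the same value.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test against HWE (biallelic locus, 1 df)."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    # Step 2: allele frequencies from genotype counts
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    # Step 3: expected counts under HWE
    observed = [n_AA, n_Aa, n_aa]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    if min(expected) == 0:
        raise ValueError("locus is monomorphic; the test is undefined")
    # Step 4: chi-square statistic
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 5: tail probability for 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = hwe_chi_square(50, 30, 20)
verdict = "deviates from HWE" if p_value < 0.05 else "consistent with HWE"
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}: {verdict}")
```

Note that `p_value` here is the test's tail probability, not the allele frequency p.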

Pro tip: when sample sizes are small (under 25-30 individuals), or when any expected genotype count falls below about 5, an exact test in the style of Fisher's exact test is more appropriate than chi-square. It handles the probability calculations directly without relying on approximations that break down with low counts.
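A sketch of what such an exact calculation can look like. It assumes the test meant here is the usual exact test for HWE, which conditions on the observed allele counts and sums the probabilities of every heterozygote count that is no more likely than the one observed. That reading is an assumption, as are the function names. Exact fractions keep the comparison free of rounding error, which is fine for small samples but slow for large ones.

```python
from fractions import Fraction
from math import factorial


def _config_probability(n_het, n_A, n_a):
    """P(n_het heterozygotes | allele counts n_A and n_a) under HWE."""
    n = (n_A + n_a) // 2          # number of individuals
    n_AA = (n_A - n_het) // 2     # homozygotes implied by the allele counts
    n_aa = (n_a - n_het) // 2
    numerator = 2 ** n_het * factorial(n) * factorial(n_A) * factorial(n_a)
    denominator = (factorial(n_AA) * factorial(n_het) * factorial(n_aa)
                   * factorial(2 * n))
    return Fraction(numerator, denominator)


def hwe_exact_p_value(n_AA, n_Aa, n_aa):
    """Exact HWE test P-value for genotype counts at a biallelic locus."""
    n_A = 2 * n_AA + n_Aa
    n_a = 2 * n_aa + n_Aa
    if n_A + n_a == 0:
        raise ValueError("need at least one genotyped individual")
    observed_prob = _config_probability(n_Aa, n_A, n_a)
    # Heterozygote counts share the parity of n_A and cannot exceed
    # the count of the rarer allele
    possible_hets = range(n_A % 2, min(n_A, n_a) + 1, 2)
    probs = (_config_probability(h, n_A, n_a) for h in possible_hets)
    return float(sum(prob for prob in probs if prob <= observed_prob))


# Hypothetical small sample: 2 AA, 0 Aa, 2 aa
print(hwe_exact_p_value(2, 0, 2))
```

For routine work, established implementations such as those in the tools listed earlier are a safer choice than hand-rolled code.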

The Bottom Line

Hardy-Weinberg Equilibrium is a foundational concept in population genetics. It gives you a null model to detect evolutionary forces. You calculate expected genotype frequencies from allele frequencies, compare them to what you observe, and use the difference to identify selection, drift, migration, or mating patterns.

The math is simple. The interpretation requires understanding your study population and knowing which evolutionary forces are biologically plausible in that system. Numbers without context don't tell you much.