Markov Chains Explained: Khan Academy Tutorial
What Are Markov Chains?
A Markov chain is a mathematical system that tracks how you move from one state to another. The core idea is ruthlessly simple: what happens next depends only on where you are right now. Not the past. Not the future. Just the present.
That's called the Markov property. It's named after Russian mathematician Andrey Markov, who introduced the concept in 1906 and, in 1913, used it to analyze vowel and consonant patterns in Pushkin's Eugene Onegin. Yeah, poetry. The same math that now shows up in weather models started with literature.
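In symbols, the Markov property can be written roughly as follows. The notation here is an assumption for illustration, not something taken from the tutorial: X_n stands for the state at step n, and i and j are particular states.

```latex
P(X_{n+1} = j \mid X_n = i,\ X_{n-1} = i_{n-1},\ \dots,\ X_0 = i_0)
  = P(X_{n+1} = j \mid X_n = i)
```

Everything to the left of X_n in the condition drops out. That's the "only the present matters" idea in one line.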
Markov chains are everywhere now. Your phone's predictive text. Stock market models. Google PageRank. That Spotify shuffle that feels oddly random but isn't. If you've used any of these, you've already encountered the math.
The Basic Mechanics
Every Markov chain has three components you need to understand:
- States — The different conditions the system can be in. Weather can be sunny, rainy, or cloudy. A webpage can be any page on the internet.
- Transition probabilities — The likelihood of moving from one state to another. If it's sunny today, maybe there's a 70% chance it stays sunny tomorrow.
- Transition matrix — A table that organizes all these probabilities in one place.
That's it. Those three things define the entire system.
The Transition Matrix
Here's what a transition matrix looks like for a simple weather model:
| Today (row) / Tomorrow (column) | Sunny | Cloudy | Rainy |
|---|---|---|---|
| Sunny | 0.7 | 0.2 | 0.1 |
| Cloudy | 0.3 | 0.4 | 0.3 |
| Rainy | 0.2 | 0.3 | 0.5 |
Reading this table: if today is sunny, tomorrow has a 70% chance of being sunny, 20% cloudy, 10% rainy. The rows always sum to 1.0 because something has to happen next.
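Reading a row and rolling the dice is all a simulation needs. Here's a minimal sketch that walks the weather table forward one day at a time. The names `STATES`, `TRANSITIONS`, `next_state`, and `simulate` are made up for this example, and the fixed seed is only there so reruns match.

```python
import random

STATES = ["Sunny", "Cloudy", "Rainy"]

# Each key is today's weather; the list holds tomorrow's probabilities
# in the same order as STATES (copied from the table above).
TRANSITIONS = {
    "Sunny":  [0.7, 0.2, 0.1],
    "Cloudy": [0.3, 0.4, 0.3],
    "Rainy":  [0.2, 0.3, 0.5],
}


def next_state(today, rng):
    """Sample tomorrow's weather using only today's row of the table."""
    return rng.choices(STATES, weights=TRANSITIONS[today], k=1)[0]


def simulate(start, days, rng):
    """Return a list of days + 1 states, beginning with start."""
    path = [start]
    for _ in range(days):
        path.append(next_state(path[-1], rng))
    return path


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the path is reproducible
    print(simulate("Sunny", 7, rng))
```

Notice that `next_state` never looks at the history in `path`, only at the last entry. That's the Markov property showing up in code.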
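The math is just multiplication and addition. Write your current state as a row vector of probabilities (a 1 for the state you're in and 0 everywhere else, if you know it for certain), multiply by the transition matrix, and you get the probability distribution for the next state. Multiply again and you get the distribution two steps out.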
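Here's that multiplication as a small sketch in plain Python, using the weather table above. The names `P` and `step` are assumptions for this example, and a library like NumPy would do the same job in one line.

```python
# Transition matrix from the weather table; rows and columns are
# ordered Sunny, Cloudy, Rainy.
P = [
    [0.7, 0.2, 0.1],  # from Sunny
    [0.3, 0.4, 0.3],  # from Cloudy
    [0.2, 0.3, 0.5],  # from Rainy
]


def step(dist, matrix):
    """Multiply a row vector by the matrix: move the distribution one day forward."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


today = [1.0, 0.0, 0.0]         # we know for certain that today is sunny
tomorrow = step(today, P)       # same numbers as the Sunny row
day_after = step(tomorrow, P)   # two days out

print([round(p, 2) for p in tomorrow])   # [0.7, 0.2, 0.1]
print([round(p, 2) for p in day_after])  # [0.57, 0.25, 0.18]
```

The 0.57 is just 0.7 × 0.7 + 0.2 × 0.3 + 0.1 × 0.2: every way of ending up sunny in two days, added together.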
Why Khan Academy Works for Learning This
Most resources either oversimplify or drown you in notation. Khan Academy hits a middle ground that's actually useful.
Sal Khan teaches by building intuition first. He'll show you a problem, let you sit with it, then walk through the solution step by step. No prerequisites beyond basic probability and matrix multiplication.
The exercises are the real value. You can read about Markov chains all day, but you won't understand them until you've multiplied a few hundred matrices. Khan Academy forces you to do the work.
The content is free. That matters when you're teaching yourself something this abstract. No paywall between you and understanding.
What Khan Academy Covers
- Introduction to Markov chains and the memoryless property
- Transition matrices and how to read them
- Calculating long-term behavior and steady states
- Absorbing states and their properties
- Real-world applications and examples
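To make the steady-state item on that list concrete, here's a rough sketch that keeps multiplying a distribution by the weather matrix until it stops changing. The function names, the tolerance, and the iteration cap are assumptions for this example. Repeated multiplication works for a chain like this one, where every entry is positive, but it isn't guaranteed to settle for every chain.

```python
def step(dist, matrix):
    """Row vector times matrix: one step forward."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def steady_state(matrix, tol=1e-12, max_iter=10_000):
    """Iterate from a uniform start until the distribution stops moving."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        new = step(dist, matrix)
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    raise RuntimeError("distribution did not settle within max_iter steps")


P = [
    [0.7, 0.2, 0.1],  # from Sunny
    [0.3, 0.4, 0.3],  # from Cloudy
    [0.2, 0.3, 0.5],  # from Rainy
]

pi = steady_state(P)
print([round(p, 4) for p in pi])  # [0.4565, 0.2826, 0.2609]
```

Those numbers are 21/46, 13/46, and 12/46, which you can confirm by hand by solving "distribution times matrix equals distribution" with the entries summing to 1.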
Getting Started With the Tutorial
Here's what to do:
- Start with the probability basics — Khan Academy has a full statistics course. If you can't multiply fractions or don't understand "P(A and B)", stop here and fix that first.
- Find the Markov chain section — Type "Markov chains" into the Khan Academy search bar. It's under the statistics and probability course, usually nested in the random processes unit.
- Watch once without notes — Get the general idea. Don't try to understand everything.
- Watch again with paper — Work through every example Sal writes. Pause if you need to.
- Do every exercise — Until you can do them without checking the hints.
Most people finish the core tutorial in 3-5 hours if they already know basic probability. Budget more time if you're rusty.
What to Do After Khan Academy
The Khan Academy tutorial gives you the foundation. After that, you need to see more examples and different contexts.
| Resource | Best For | Cost |
|---|---|---|
| Khan Academy | Building intuition, first exposure | Free |
| 3Blue1Brown (YouTube) | Visual understanding, "aha" moments | Free |
| Grinstead & Snell (Book) | Rigorous theory, exercises | Free PDF |
| MIT OpenCourseWare | University-level depth | Free |
3Blue1Brown's video on Markov chains is particularly good for the geometric intuition. He animates the state transitions, which makes the abstract concepts concrete.
If you want to code Markov chains, grab any Python tutorial and implement a simple text generator. Take a book, calculate transition probabilities between letters, and generate fake text. It's a classic exercise that actually works.
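Here's one way that exercise might look, as a minimal sketch. It counts which character follows which, then samples from those counts. The function names and the sample sentence are placeholders for this example, so swap in the text of a real book to get anything interesting.

```python
import random
from collections import Counter, defaultdict


def build_model(text):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate(model, start, length, rng):
    """Walk the chain, sampling each next character in proportion to its count."""
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:  # dead end: this character never had a successor
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog and the cat"
    model = build_model(sample)
    print(generate(model, "t", 40, random.Random(0)))
```

The raw counts act as unnormalized transition probabilities, since `random.choices` only cares about relative weights. Switching from characters to words is mostly a matter of splitting the text first.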
Common Mistakes
- Forgetting that rows sum to 1 — Each row of the transition matrix must add up to exactly 1. If one doesn't, you made an error somewhere.
- Confusing states with transitions — States are where you are. Transitions are how you move.
- Ignoring the initial state — Your starting point affects the first few steps, even if long-term behavior converges.
- Skipping the exercises — Reading isn't learning. You have to do the problems.
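The first mistake on that list is easy to catch automatically. Here's a small sketch of a checker. The function name and the tolerance are assumptions for this example, and the tolerance exists only because floating-point sums are rarely exactly 1.

```python
import math


def validate_transition_matrix(matrix, tol=1e-9):
    """Return a list of problems; an empty list means the matrix looks valid."""
    problems = []
    n = len(matrix)
    for i, row in enumerate(matrix):
        if len(row) != n:
            problems.append(f"row {i} has {len(row)} entries, expected {n}")
        if any(p < 0 or p > 1 for p in row):
            problems.append(f"row {i} has a value outside [0, 1]")
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            problems.append(f"row {i} sums to {sum(row):.4f}, not 1")
    return problems


weather = [
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
]
print(validate_transition_matrix(weather))  # []
```

Run something like this on any matrix you type in by hand before you multiply anything with it.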
The Bottom Line
Markov chains are not complicated. The notation looks scary, but the underlying concept is something a high school student can grasp. The math is just multiplication and addition.
Khan Academy is a solid starting point. The tutorial won't make you an expert, but it'll give you enough to understand what experts are talking about. From there, you can decide if you want to go deeper.
Start now. The best time to learn this was when you first encountered it in predictive text. The second best time is right now.