Discrete Random Variable: Probability and Statistics Guide
What Is a Discrete Random Variable?
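A discrete random variable takes on specific, separate values. Not every decimal in a range. Not any value in between. Just distinct, countable outcomes, whether there are finitely many of them or a countably infinite list such as 0, 1, 2, and so on.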
Think of flipping a coin three times. The number of heads you get can only be 0, 1, 2, or 3. Those are discrete values. Now think of measuring someone's height with perfect precision. That can be any value within a range. That's continuous, not discrete.
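To make the coin example concrete, here is a minimal Python sketch that enumerates every sequence of three flips and tallies the number of heads. It assumes a fair coin, so all eight sequences are equally likely; the names `sequences` and `heads_pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 2**3 = 8 equally likely sequences of three fair coin flips.
sequences = list(product("HT", repeat=3))

# Count how many sequences produce each number of heads.
heads_counts = Counter(seq.count("H") for seq in sequences)

# Turn counts into exact probabilities.
heads_pmf = {k: Fraction(c, len(sequences)) for k, c in sorted(heads_counts.items())}

for k, prob in heads_pmf.items():
    print(k, prob)  # 0 1/8, 1 3/8, 2 3/8, 3 1/8
```

The only possible values are 0, 1, 2, and 3. Nothing in between ever shows up, which is exactly what makes the variable discrete.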
The distinction matters. The math you use depends on which type you're dealing with. Mix them up and your analysis falls apart.
Probability Mass Functions (PMF)
Every discrete random variable has a probability mass function. This tells you the probability of each possible outcome.
For a fair six-sided die, the PMF looks like this:
P(X = x) = 1/6 for x = 1, 2, 3, 4, 5, 6
That's it. Each value has a probability of exactly 1/6. The sum of all probabilities equals 1. That's non-negotiable.
The Rules a PMF Must Follow
- Every probability is between 0 and 1
- The sum of all probabilities equals exactly 1
- Each possible value is assigned exactly one probability
If your "distribution" violates these rules, it's not a probability distribution. End of discussion.
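The rules above are easy to check mechanically. The sketch below is one possible check, not a standard library function: `is_valid_pmf` is a hypothetical helper, and the PMF is assumed to be stored as a plain dict mapping each value to its probability.

```python
from fractions import Fraction
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every probability lies in [0, 1] and the total is 1 (within tol)."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(float(sum(probs)), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Fair six-sided die: each face has probability 1/6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# A broken "distribution": the probabilities add up to 1.2.
bad_pmf = {0: 0.5, 1: 0.7}

print(is_valid_pmf(die_pmf))  # True
print(is_valid_pmf(bad_pmf))  # False
```

The tolerance is there because floating-point probabilities rarely sum to exactly 1. With `Fraction` values the sum is exact.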
Expected Value: What You Actually Expect to Get
The expected value is the long-run average you would approach if you repeated an experiment infinitely many times. People confuse this with "what you'll probably get on your next try." That's wrong.
The formula is straightforward:
E[X] = Σ x · P(X = x)
Multiply each outcome by its probability, then sum everything up.
Example: A game costs $5 to play. You roll a die. If you roll a 6, you win $25. Anything else, you win nothing. Let X be your net result: +$20 on a 6 (the $25 prize minus the $5 fee) and -$5 otherwise.
E[X] = (1/6 × $20) + (5/6 × -$5) = $3.33 - $4.17 ≈ -$0.83
The expected value is negative. This game is a losing proposition. Run the math before you play.
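Here is a small sketch of that calculation in Python. It assumes the outcomes are net results (prize minus the $5 fee) and uses exact fractions to sidestep rounding; `expected_value` is an illustrative helper, not a library function.

```python
from fractions import Fraction

def expected_value(pmf):
    """Sum of outcome * probability over a dict {outcome: probability}."""
    return sum(x * p for x, p in pmf.items())

# Net result in dollars: +20 on a six, -5 on anything else.
game_pmf = {20: Fraction(1, 6), -5: Fraction(5, 6)}

ev = expected_value(game_pmf)
print(ev)                   # -5/6
print(round(float(ev), 2))  # -0.83
```

The exact answer is -5/6 of a dollar per play, a loss of roughly 83 cents each time on average.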
Variance and Standard Deviation
Expected value tells you the center. Variance tells you the spread. How much does the outcome typically deviate from the mean?
Var(X) = E[X²] - (E[X])²
Calculate E[X²] by squaring each outcome before multiplying by its probability. Then subtract the square of the expected value.
Standard deviation is just the square root of variance. It puts the measure back in the original units, which makes interpretation easier.
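As a worked case, the sketch below applies the shortcut formula to a fair six-sided die. The variable names are illustrative, and exact fractions are used so the intermediate values can be checked by hand.

```python
from fractions import Fraction
import math

# Fair six-sided die.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in die_pmf.items())              # E[X]
second_moment = sum(x**2 * p for x, p in die_pmf.items())  # E[X^2]
variance = second_moment - mean**2                         # E[X^2] - (E[X])^2
std_dev = math.sqrt(variance)

print(mean)           # 7/2
print(second_moment)  # 91/6
print(variance)       # 35/12
print(std_dev)        # about 1.708
```

A standard deviation of about 1.7 is in the same units as the die faces, which is why it is easier to interpret than the variance of 35/12.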
Common Discrete Distributions You Need to Know
Some distributions show up constantly. Memorize these. You'll encounter them in exams, real data, and job interviews.
Binomial Distribution
You have n independent trials, each with probability p of success. Count the number of successes.
Conditions:
- Fixed number of trials
- Two possible outcomes per trial
- Constant probability of success
- Trials are independent
Example: You survey 100 people. Each independently has a 30% probability of supporting a policy. How many supporters will you get? The count is binomial with n = 100 and p = 0.3.
Mean = np
Variance = np(1-p)
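The mean and variance formulas can be checked against the PMF directly. The sketch below builds the binomial PMF for the survey example (n = 100, p = 0.3) using the standard formula C(n, k) p^k (1-p)^(n-k), which this guide does not state explicitly. It assumes Python 3.8 or later for `math.comb`, and `binomial_pmf` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 100, Fraction(3, 10)
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

mean = sum(k * q for k, q in pmf.items())
variance = sum(k**2 * q for k, q in pmf.items()) - mean**2

print(sum(pmf.values()))  # 1
print(mean)               # 30, which is n * p
print(variance)           # 21, which is n * p * (1 - p)
```

So the survey should produce about 30 supporters on average, with a standard deviation of √21, roughly 4.6.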
Poisson Distribution
Models the number of events happening in a fixed interval when events occur independently and at a constant, known average rate.
Good for: number of emails per hour, customers arriving at a store, defects per unit.
The parameter λ represents both the mean and the variance. That's unusual. Most distributions don't have this property.
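You can confirm the mean-equals-variance property numerically. The sketch below uses the standard Poisson PMF, e^(-λ) λ^k / k!, which this guide does not state explicitly. The rate λ = 4 is an arbitrary choice for illustration, and the infinite sum is cut off at k = 59, where the remaining tail is negligible for this rate.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 4.0
ks = range(60)  # truncation point; the tail beyond this is negligible for lam = 4

total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum(k**2 * poisson_pmf(k, lam) for k in ks) - mean**2

print(total, mean, variance)  # approximately 1, 4, 4
```

Larger rates need a later cutoff, so treat the truncation point as something to adjust rather than a fixed constant.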
Geometric Distribution
Counts how many trials until the first success. You're flipping coins until you get heads. How many flips until that happens?
The expected number of trials is 1/p. If p = 0.2, you expect 5 trials on average.
Key property: memoryless. If you've already flipped 10 times without heads, your probability of getting heads on the next flip is still p. The past doesn't matter.
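Both claims can be checked with a few lines of Python. The sketch below assumes the "number of trials until the first success" convention used here, so P(X = k) = (1-p)^(k-1) p for k = 1, 2, 3, and so on. The function names are illustrative, and the mean is approximated by truncating the infinite sum.

```python
from fractions import Fraction

p = Fraction(1, 5)  # success probability 0.2

def geometric_pmf(k, p):
    """P(X = k): the first success happens on trial k."""
    return (1 - p) ** (k - 1) * p

def survival(k, p):
    """P(X > k): the first k trials all fail."""
    return (1 - p) ** k

# Mean: truncated sum of k * P(X = k); the tail past 500 terms is negligible.
approx_mean = float(sum(k * geometric_pmf(k, p) for k in range(1, 500)))
print(approx_mean)  # approximately 5, which is 1/p

# Memorylessness: P(X > 10 + 3 | X > 10) equals P(X > 3).
conditional = survival(13, p) / survival(10, p)
print(conditional == survival(3, p))  # True
```

Ten failures in a row change nothing: the chance of needing more than three further trials is the same as it was at the start.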
Distribution Comparison
| Distribution | What It Counts | Parameters | Mean | Variance |
|---|---|---|---|---|
| Binomial | Successes in n trials | n, p | np | np(1-p) |
| Poisson | Events in interval | λ (rate) | λ | λ |
| Geometric | Trials until first success | p | 1/p | (1-p)/p² |
| Hypergeometric | Successes without replacement | N, K, n | n(K/N) | n(K/N)(1-K/N)((N-n)/(N-1)) |
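The hypergeometric row is the easiest one to get wrong, so here is a sketch that checks its mean and variance formulas against the PMF. The example (counting hearts in a 5-card hand from a 52-card deck, so N = 52, K = 13, n = 5) is an assumption chosen for illustration. The PMF is the standard C(K, k) C(N-K, n-k) / C(N, n), which the table does not list.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement from N items, K of them successes."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

N, K, n = 52, 13, 5  # deck size, hearts in the deck, cards drawn
support = range(max(0, n - (N - K)), min(n, K) + 1)
pmf = {k: hypergeometric_pmf(k, N, K, n) for k in support}

mean = sum(k * q for k, q in pmf.items())
variance = sum(k**2 * q for k, q in pmf.items()) - mean**2

# Formulas from the table.
ratio = Fraction(K, N)
formula_mean = n * ratio
formula_variance = n * ratio * (1 - ratio) * Fraction(N - n, N - 1)

print(mean == formula_mean)          # True
print(variance == formula_variance)  # True
```

The last factor, (N-n)/(N-1), is what separates this from the binomial variance. It shrinks the spread because each draw without replacement reduces the remaining uncertainty.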
How to Work With Discrete Random Variables
Step 1: Identify the Distribution
Before calculating anything, figure out which distribution fits your situation. Ask yourself:
- Are there two outcomes or many?
- Is the probability constant across trials?
- Are trials independent or dependent?
- Are you counting events in time/space or trials until an outcome?
Step 2: Write Out the PMF
List every possible value and its probability. Verify the sum equals 1. This catches mistakes early.
Step 3: Calculate Expected Value and Variance
Use the formulas. Show your work. One arithmetic error destroys everything.
Step 4: Interpret the Results
What does this actually mean for your situation? A variance of 4.5 might be meaningless on its own. Convert it to standard deviation. Ask if the spread makes sense given the context.
Common Mistakes to Avoid
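Confusing discrete and continuous. Don't use probability density functions for discrete variables. A discrete variable has a PMF, and you add up its probabilities directly. A continuous variable has a density, and its probabilities come from integrating that density, as areas under a curve.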
Forgetting the sum equals 1. Always verify. If it doesn't equal 1, something is wrong.
Misapplying the memoryless property. Only the geometric distribution (discrete) and the exponential distribution (continuous) have it. Binomial doesn't.
Using expected value as a prediction. E[X] doesn't tell you what you'll see in a single trial. It tells you the long-run average. One roll of a die gives you 1 through 6, not 3.5.
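A quick simulation shows the difference between a single outcome and the long-run average. This is a sketch using Python's standard `random` module; the seed and the number of rolls are arbitrary choices made so the run is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(100_000)]  # randint includes both endpoints

long_run_average = sum(rolls) / len(rolls)

print(rolls[:10])        # individual rolls: always whole numbers from 1 to 6
print(long_run_average)  # close to 3.5
print(3.5 in rolls)      # False: no single roll ever equals the expected value
```

The average settles near 3.5 only because many rolls are combined. Any one roll tells you almost nothing about it.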
When to Use Each Distribution
Binomial when you have a fixed number of independent yes/no trials.
Poisson when you're counting events in time or space with a known average rate.
Geometric when you're waiting for the first success.
Hypergeometric when you're sampling without replacement from a finite population.
Pick wrong and your answer will be wrong. There's no partial credit for using the wrong distribution.