Scatter Plot Correlation Value: Finding and Interpreting

What Is Correlation in a Scatter Plot?

A scatter plot shows the relationship between two variables. Each point on the chart represents one data pair. Correlation describes how closely the two variables move together.

That's it. You're looking for patterns. Do the points trend upward, downward, or show no pattern at all?

The Three Types of Correlation

Positive correlation: As one variable increases, the other tends to increase. The points trend upward from left to right. Example: hours studied vs. test scores.

Negative correlation: As one variable increases, the other decreases. The points trend downward. Example: age of a car vs. its resale value.

No correlation: The points are scattered with no visible trend. Knowing one variable tells you nothing about the other. Example: shoe size vs. intelligence.

Understanding the Correlation Coefficient (r)

The correlation coefficient, represented as r, gives you a number between -1 and +1. This number tells you the strength and direction of the linear relationship: the sign gives the direction, and the distance from zero gives the strength.

What the Numbers Mean

Quick Reference Table

r Value         Interpretation
+0.8 to +1.0    Strong positive
+0.5 to +0.7    Moderate positive
+0.2 to +0.4    Weak positive
-0.2 to +0.2    Negligible/no correlation
-0.4 to -0.2    Weak negative
-0.7 to -0.5    Moderate negative
-1.0 to -0.8    Strong negative
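
If you want to label r values in code, the table above can be turned into a small lookup. This is only a sketch: the function name interpret_r is made up for illustration, and it assumes the lower bound of each band (0.2, 0.5, 0.8) works as a cutoff, so values that fall between the table's bands (such as 0.45 or 0.75) get the weaker label.

```python
def interpret_r(r: float) -> str:
    """Map a correlation coefficient to a label from the reference table.

    Assumption: the lower bound of each band (0.2, 0.5, 0.8) is used as a
    cutoff so that every value in [-1, 1] gets exactly one label.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")

    strength = abs(r)
    if strength < 0.2:
        return "Negligible/no correlation"

    if strength >= 0.8:
        label = "Strong"
    elif strength >= 0.5:
        label = "Moderate"
    else:
        label = "Weak"

    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


for value in (0.93, 0.55, -0.3, 0.05, -0.85):
    print(f"{value:+.2f}: {interpret_r(value)}")
```

Treat the labels as rough guidance. Where the cutoffs sit is a convention, and different fields draw the lines in different places.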

How to Find the Correlation Value

Excel Method

Use the CORREL function. Select two columns of data:

=CORREL(A2:A20, B2:B20)

This returns the r value instantly.

Python Method

from scipy import stats

# Paired observations: x[i] goes with y[i]
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# pearsonr returns the coefficient r and a two-sided p-value
r, p_value = stats.pearsonr(x, y)
print(f"Correlation: {r:.3f}")

Google Sheets Method

Use =CORREL(range1, range2). The syntax is the same as in Excel.

Manual Calculation (Pearson's r)

The formula looks like this:

r = [Σ(x-x̄)(y-ȳ)] / [√(Σ(x-x̄)²) × √(Σ(y-ȳ)²)]

Don't calculate this by hand unless you enjoy pain. Use software instead.
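
If you're curious what the software does under the hood, here is a minimal sketch of the formula above in plain Python, run on the same five data pairs used in the Python method. The helper name pearson_r is an illustration, not a library function.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed directly from the deviation formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)

    # Deviations of each value from its mean
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]

    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    # Note: denominator is zero if either variable is constant
    return numerator / denominator


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.4f}")  # 6 / sqrt(10 * 6), about 0.7746
```

For this data the sum of products is 6, and the two sums of squares are 10 and 6, so r = 6 / √60 ≈ 0.77.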

What to Look for in the Scatter Plot Itself

You can often spot correlation without calculating anything:

Common Mistakes to Avoid

Confusing correlation with causation: Just because two variables correlate doesn't mean one causes the other. Ice cream sales and drowning rates both increase in summer. Ice cream doesn't cause drowning.
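
A small simulation can make the point concrete. The sketch below uses entirely synthetic numbers (the coefficients, noise levels, and temperature range are invented assumptions, not real data): temperature drives both series, and neither series influences the other, yet they still correlate.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the deviation formula (illustrative helper)."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / (sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))


random.seed(0)  # fixed seed so the run is repeatable

# Synthetic daily temperatures: the hidden common cause
temperature = [random.uniform(10, 35) for _ in range(200)]

# Both series depend on temperature plus independent noise.
# Neither one appears in the other's formula.
ice_cream_sales = [10 * t + random.gauss(0, 20) for t in temperature]
drownings = [0.2 * t + random.gauss(0, 1) for t in temperature]

r = pearson_r(ice_cream_sales, drownings)
print(f"ice cream vs. drownings: r = {r:.2f}")
```

The correlation comes out clearly positive even though the code contains no causal link between the two series. A third variable is doing the work.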

Ignoring outliers: One extreme point can dramatically change your r value. Always visualize your data before trusting the number.
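
Here is a quick sketch of how much damage one point can do. The eight data pairs are invented for illustration, and the single outlier at (30, 30) is an assumption chosen to be extreme.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the deviation formula (illustrative helper)."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / (sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))


# Made-up data with a weak downward drift
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 3, 6, 2, 6, 3, 5, 2]

r_without = pearson_r(x, y)

# Add a single extreme point far from the rest of the data
r_with = pearson_r(x + [30], y + [30])

print(f"without outlier: r = {r_without:.2f}")  # about -0.28
print(f"with outlier:    r = {r_with:.2f}")     # about 0.94
```

One extra point flips a weak negative correlation into what looks like a strong positive one. On a scatter plot you would spot that lone point immediately.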

Assuming linearity: A low r value doesn't mean no relationship exists. The variables might have a curved relationship. Always plot your data first.
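
To see this in numbers, the sketch below uses a perfect curved relationship, y = x², over a range of x values symmetric around zero. The data is constructed for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the deviation formula (illustrative helper)."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / (sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))


x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]  # y is completely determined by x

r = pearson_r(x, y)
print(f"r = {abs(r):.2f}")  # the upward and downward halves cancel out
```

Here y depends on x with no noise at all, yet r is zero because the left half trends down and the right half trends up. The U shape is obvious on a plot and invisible in the coefficient.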

Small sample sizes: With only 5 data points, even unrelated variables can produce a large r by chance. Bigger samples give more reliable results.
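
You can check the small-sample problem yourself with a simulation. The sketch below draws two unrelated random variables over and over and counts how often |r| exceeds 0.5. The threshold, trial count, and sample sizes of 5 and 50 are assumptions picked for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the deviation formula (illustrative helper)."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / (sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))


def share_of_large_r(sample_size, trials=2000, threshold=0.5):
    """Fraction of trials where two independent variables give |r| > threshold."""
    hits = 0
    for _ in range(trials):
        xs = [random.gauss(0, 1) for _ in range(sample_size)]
        ys = [random.gauss(0, 1) for _ in range(sample_size)]  # independent of xs
        if abs(pearson_r(xs, ys)) > threshold:
            hits += 1
    return hits / trials


random.seed(1)  # fixed seed so the run is repeatable
small = share_of_large_r(5)
large = share_of_large_r(50)

print(f"n = 5:  |r| > 0.5 in {small:.0%} of trials")
print(f"n = 50: |r| > 0.5 in {large:.0%} of trials")
```

With 5 points, a sizeable share of pure-noise samples clears the 0.5 mark. With 50 points, it almost never happens.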

When to Use Scatter Plots

Scatter plots work when both variables are continuous and numerical. Good use cases:

Bad use cases:

R-Squared: The Coefficient of Determination

You might see R² reported alongside r. For a straight-line fit with a single predictor, R² is simply r squared. It tells you what fraction of the variation in Y is explained by X.

Example: If r = 0.8, then R² = 0.64. This means 64% of the variation in the dependent variable is explained by the independent variable. The remaining 36% comes from other factors.
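
The arithmetic is short enough to sketch in a few lines. The function name variance_split is made up for illustration, and the calculation assumes the simple one-predictor case described above.

```python
def variance_split(r):
    """Return (explained, unexplained) shares of variation for a given r."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


explained, unexplained = variance_split(0.8)
print(f"R² = {explained:.2f}, unexplained = {unexplained:.2f}")
```

Notice that R² drops off faster than r: an r of 0.5 sounds moderate, but it explains only 25% of the variation.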

Practical Example

Let's say you're analyzing study time vs. exam scores for 50 students. You collect the data, plot it, and calculate r = 0.72.

Interpretation: There's a moderate-to-strong positive relationship. Students who study more tend to score higher. But R² = 0.72² ≈ 0.52, so about 48% of exam score variation is not explained by study time. It comes from other factors: natural ability, exam anxiety, quality of notes, etc.

This is useful. But you can't say studying causes better scores based on this alone.

Getting Started Checklist

Correlation analysis is a starting point, not a conclusion. It tells you a relationship exists. It doesn't tell you why.