Finding Sx and Sy: Statistical Methods Explained

What the Heck Are Sx and Sy?

In statistics, Sx and Sy are just the standard deviations of your x-variable and y-variable respectively. That's it. Nothing fancy.

You might see them called sx (sample standard deviation of x) and sy (sample standard deviation of y) in your textbook or calculator. They're the same thing.

These values show how spread out your data points are around the mean for each variable. A larger Sx means x-values are more scattered. A smaller Sy means y-values cluster tighter around their average.

The Formula Nobody Remembers

Here's the actual formula for calculating Sx:

Sx = √[Σ(xi - x̄)² / (n-1)]

And for Sy:

Sy = √[Σ(yi - ȳ)² / (n-1)]

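Where xi and yi are the individual data values, x̄ and ȳ are the means of the x-values and y-values, and n is the number of data points.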

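The n-1 in the denominator is there because you're working with a sample, not an entire population. Deviations measured from the sample mean run slightly small on average, so dividing by n would underestimate the spread. This correction (called Bessel's correction) makes the sample variance an unbiased estimate of the population variance.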
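To see the effect of the n-1 denominator, here is a minimal Python sketch that implements the formula directly, next to a divide-by-n version for contrast. The function names and the data values are made up for illustration and are not from a textbook or calculator.

```python
import math

def sample_sd(values):
    """Sample standard deviation: squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))

def population_sd(values):
    """Population standard deviation: squared deviations divided by n."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / n)

# Hypothetical data, chosen only so the arithmetic is easy to follow
values = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(sample_sd(values), 3))      # 2.138
print(round(population_sd(values), 3))  # 2.0
```

The sample version always comes out a little larger than the population version, and the gap shrinks as n grows.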

Step-by-Step: How to Find Sx and Sy

Example Data

Let's say you have 5 data points:

Point    x     y
1        2     3
2        4     5
3        6     7
4        8     9
5        10    11

Step 1: Calculate the Means

x̄ = (2+4+6+8+10) / 5 = 6

ȳ = (3+5+7+9+11) / 5 = 7

Step 2: Find Each Deviation from the Mean

For x-values: subtract 6 from each → -4, -2, 0, 2, 4

For y-values: subtract 7 from each → -4, -2, 0, 2, 4

Step 3: Square the Deviations

For x: 16, 4, 0, 4, 16 → sum = 40

For y: 16, 4, 0, 4, 16 → sum = 40

Step 4: Divide by (n-1)

n = 5, so n-1 = 4

40 / 4 = 10

Step 5: Take the Square Root

Sx = √10 ≈ 3.16

Sy = √10 ≈ 3.16

Your standard deviations are both 3.16. Makes sense here since x and y have the exact same spread.
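The five steps above can be traced in a few lines of Python. This is just a sketch using the example data from the table; the helper name sd_steps is made up for illustration.

```python
import math

x = [2, 4, 6, 8, 10]
y = [3, 5, 7, 9, 11]

def sd_steps(values):
    """Return every intermediate result of the hand calculation."""
    n = len(values)
    mean = sum(values) / n                    # Step 1: mean
    deviations = [v - mean for v in values]   # Step 2: deviations from the mean
    squares = [d ** 2 for d in deviations]    # Step 3: squared deviations
    variance = sum(squares) / (n - 1)         # Step 4: divide by n - 1
    sd = math.sqrt(variance)                  # Step 5: square root
    return mean, deviations, squares, variance, sd

mean_x, dev_x, sq_x, var_x, sx = sd_steps(x)
mean_y, dev_y, sq_y, var_y, sy = sd_steps(y)

print(mean_x, sum(sq_x), var_x, round(sx, 2))  # 6.0 40.0 10.0 3.16
print(mean_y, sum(sq_y), var_y, round(sy, 2))  # 7.0 40.0 10.0 3.16
```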

How to Get Sx and Sy on a Calculator

Doing this by hand is tedious. Here's how to get these values fast.

TI-84 Calculator

  1. Press STAT
  2. Select 1: Edit
  3. Enter your x-values in L1 and y-values in L2
  4. Press STAT again
  5. Go to CALC
  6. Select 1-Var Stats
  7. Enter L1 (for Sx) or L2 (for Sy) and press Enter

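The output shows Sx (sample) and σx (population, if you need that instead). 1-Var Stats labels its result Sx even when you run it on L2, so read that value as Sy. If you want both at once, 2-Var Stats on L1 and L2 reports Sx and Sy on the same screen.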

Casio fx-9750GIII

  1. Go to STAT mode
  2. Enter data in columns
  3. Press CALC
  4. Select 1-Variable
  5. Choose your column

Using Sx and Sy to Find Correlation

Sx and Sy become useful when calculating the Pearson correlation coefficient (r). Here's the formula for r written out in full:

r = Σ(xi - x̄)(yi - ȳ) / √[Σ(xi - x̄)² × Σ(yi - ȳ)²]

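Because Σ(xi - x̄)² = (n-1) × Sx² and Σ(yi - ȳ)² = (n-1) × Sy², the square root in the denominator equals (n-1) × Sx × Sy, so this simplifies to: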

r = Σ(xi - x̄)(yi - ȳ) / [(n-1) × Sx × Sy]

Once you have r, square it to get r², the coefficient of determination, which tells you what proportion of the variance in y is explained by x.
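As a quick check that the two forms of the r formula agree, here is a short Python sketch using the example data from earlier in the article. The variable names are illustrative only. That data lies exactly on a straight line, so r comes out as 1.

```python
import math

x = [2, 4, 6, 8, 10]
y = [3, 5, 7, 9, 11]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Building blocks shared by both forms of the formula
sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sum_xx = sum((a - mean_x) ** 2 for a in x)
sum_yy = sum((b - mean_y) ** 2 for b in y)

# Form 1: sums of squares under the square root
r_raw = sum_xy / math.sqrt(sum_xx * sum_yy)

# Form 2: the same thing written with Sx and Sy
sx = math.sqrt(sum_xx / (n - 1))
sy = math.sqrt(sum_yy / (n - 1))
r_from_sd = sum_xy / ((n - 1) * sx * sy)

r_squared = r_raw ** 2

print(round(r_raw, 6), round(r_from_sd, 6), round(r_squared, 6))  # 1.0 1.0 1.0
```

The two values can differ in the last decimal place because of floating-point rounding, which is why the sketch rounds before printing.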

Sx and Sy in Linear Regression

In simple linear regression, Sx shows up in the slope formula:

slope (b) = r × (Sy / Sx)

This relationship is useful. If you already calculated Sx, Sy, and r, you can find your regression line without doing all the extra algebra.

The y-intercept is:

a = ȳ - b(x̄)
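Here is a small Python sketch of the slope and intercept formulas. It uses hypothetical data (not the article's example, which is perfectly linear) so that r is less than 1 and the formulas do some real work.

```python
import math

# Hypothetical data, chosen for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sx = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
sy = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))

r = sum_xy / ((n - 1) * sx * sy)

b = r * (sy / sx)        # slope
a = mean_y - b * mean_x  # y-intercept

print(round(r, 3))               # 0.775
print(round(b, 3), round(a, 3))  # 0.6 2.2
```

For this data the regression line works out to roughly y = 2.2 + 0.6x.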

Quick Comparison: Manual vs Calculator vs Software

Method          Speed        Error Risk    Best For
By Hand         Slow         High          Learning the concept
TI-84/Casio     Fast         Low           Exams, quick homework
Excel/Sheets    Very Fast    Very Low      Large datasets
Python/R        Instant      Very Low      Research, automation

How to Get Sx and Sy in Excel

Enter your data in two columns. Then use:

=STDEV.S(A2:A101) → gives you Sx for column A

=STDEV.S(B2:B101) → gives you Sy for column B

Use STDEV.P if you have the entire population, not a sample.
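If you'd rather use Python than a spreadsheet, the standard library's statistics module has direct counterparts: stdev plays the role of STDEV.S and pstdev the role of STDEV.P. A minimal sketch, using the example data from earlier in the article:

```python
import statistics

x = [2, 4, 6, 8, 10]
y = [3, 5, 7, 9, 11]

sx = statistics.stdev(x)        # sample SD, divides by n - 1 (like STDEV.S)
sy = statistics.stdev(y)
sigma_x = statistics.pstdev(x)  # population SD, divides by n (like STDEV.P)

print(round(sx, 2), round(sy, 2), round(sigma_x, 2))  # 3.16 3.16 2.83
```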

Common Mistakes That Mess Up Your Answer

What Sx and Sy Actually Tell You

These values don't mean much on their own. They're useful when you compare them.

If Sx > Sy, your x-variable is more spread out than your y-variable.

If Sy > Sx, your y-variable has more variability.

In regression, Sx sits in the denominator of the slope formula, so a larger Sx makes your slope smaller (for the same r and Sy). This is why standardizing your variables matters when comparing effects across different scales.
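To make the scale effect concrete, here is a sketch that fits the same hypothetical data twice, once with x in meters and once in centimeters. The data, units, and function name are assumptions for illustration. Changing units multiplies Sx by 100, which divides the slope by 100, while r stays the same.

```python
import math

def slope_and_r(x, y):
    """Return (slope, r) using slope = r * (Sy / Sx)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    r = sum_xy / ((n - 1) * sx * sy)
    return r * (sy / sx), r

# Hypothetical measurements: the same x-values in two different units
x_m = [1, 2, 3, 4, 5]
x_cm = [v * 100 for v in x_m]
y = [2, 4, 5, 4, 5]

slope_m, r_m = slope_and_r(x_m, y)
slope_cm, r_cm = slope_and_r(x_cm, y)

print(round(slope_m, 3), round(slope_cm, 3))  # 0.6 0.006
print(round(r_m, 3), round(r_cm, 3))          # 0.775 0.775
```

The raw slopes aren't comparable across units, but r is, because it is effectively the slope after both variables are rescaled by their standard deviations.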

When You'll Actually Use This

Sx and Sy show up in:

If you're taking stats, you'll see these in almost every chapter from correlation onward.

The Bottom Line

Sx and Sy are just the sample standard deviations of your x and y variables. The calculation is tedious by hand, but your calculator or spreadsheet does it instantly. The real skill is knowing why these values matter: they show up in correlation formulas and regression slopes, and they help you understand your data's spread. Get those concepts down and the calculations become secondary.