From Test-Scratch-Wiki

Revision as of 17:58, 19 March 2018 by HY2009

Finding the Standard Deviation of Numbers means working out how spread out a set of numbers is, or more precisely, how far the numbers typically lie from their average. It is a common measure in statistics.

How to do it in Scratch

Definition of Standard Deviation

Standard deviation is the square root of variance.

What is variance? Variance is the average of the squared differences from the mean. Both standard deviation and variance are measures of spread, that is, how much a set of numbers varies (or how far apart the numbers are). Let's take a look at an example to make this clearer.

Let's say we have 5 flowers, whose heights are 25 centimeters, 40 centimeters, 45 centimeters, 55 centimeters, and 60 centimeters. The average of their heights is:

(25 + 40 + 45 + 55 + 60) / 5 = 45

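This tells us that the average height of the flowers is 45 centimeters. What is the variance of the heights, then? First, square the difference between each height and the average, and then take the average of those squares: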

Flower #1: ((25) - (45))^2 = (-20)^2 = 400
Flower #2: ((40) - (45))^2 = (-5)^2 = 25
Flower #3: ((45) - (45))^2 = (0)^2 = 0
Flower #4: ((55) - (45))^2 = (10)^2 = 100
Flower #5: ((60) - (45))^2 = (15)^2 = 225

(400 + 25 + 0 + 100 + 225) / 5 = 150

So the variance of the flowers' heights is 150 square centimeters. The standard deviation is therefore equal to the square root of 150, or about 12.247 centimeters.
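
The arithmetic above can be double-checked outside Scratch. The following is a minimal Python sketch (Python is not part of the original tutorial, and the names `heights`, `mean`, `variance`, and `standard_deviation` are chosen here only for illustration):

```python
import math

# Heights of the five flowers, in centimeters.
heights = [25, 40, 45, 55, 60]

# Mean: the sum of the heights divided by how many there are.
mean = sum(heights) / len(heights)

# Population variance: the average of the squared differences from the mean.
squared_differences = [(h - mean) ** 2 for h in heights]
variance = sum(squared_differences) / len(heights)

# Standard deviation: the square root of the variance.
standard_deviation = math.sqrt(variance)

print(mean, variance, round(standard_deviation, 3))
```

The printed mean and variance should match the 45 and 150 worked out by hand, with a standard deviation of about 12.247.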

There are two types of standard deviation, though. One type, known as population standard deviation, is the standard deviation of an entire population. For example, if there were only 5 flowers in the world, then 12.247 would be the population standard deviation of the flowers' heights.

The other type of standard deviation is called sample standard deviation. Sample standard deviation is used when the data covers only part of a population. For example, there are obviously more than 5 flowers in the world, so the five flowers are only a sample of the population, and a sample standard deviation would be needed.

The only difference between the two types of standard deviation is how variance is calculated. Population standard deviation would follow the rules described above. Sample standard deviation, though, would take the sum of the squared differences from the mean, and then divide that by the number of data points minus one. For example, let's take a look back at the flowers and recalculate their variance:

(400 + 25 + 0 + 100 + 225) / (5 - 1) = 187.5

Here, 5 is the number of flowers that have known heights. 1 is subtracted from that because this is a sample standard deviation. The last step in calculating the sample standard deviation is to take the square root of 187.5, which is about 13.693 centimeters.
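
To see the two kinds of standard deviation side by side, here is a small Python sketch that relies on Python's standard `statistics` module rather than Scratch (an assumption made purely for checking the numbers; the variable names are illustrative):

```python
import statistics

# Heights of the five flowers, in centimeters.
heights = [25, 40, 45, 55, 60]

# Population versions divide by the number of data points (5).
population_variance = statistics.pvariance(heights)
population_sd = statistics.pstdev(heights)

# Sample versions divide by the number of data points minus one (4).
sample_variance = statistics.variance(heights)
sample_sd = statistics.stdev(heights)

print(population_variance, round(population_sd, 3))
print(sample_variance, round(sample_sd, 3))
```

The sample values come out slightly larger than the population values because the same sum of squared differences is divided by a smaller number.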

Variables

A list will be needed during this tutorial:

  • Data

This list will hold all the data points, such as the heights of the flowers. Seven variables will also be needed:

  • Average
  • Sum
  • Variance
  • Standard Deviation
  • Number
  • Sum2
  • Number2

Coding

This tutorial starts with sample standard deviation; population standard deviation is covered afterward.

The first step in calculating sample standard deviation is finding the average of the numbers in the list. The script is shown below:

when gf clicked
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (item (Number) of [Data v])
  change [Number v] by (1)//The variable (Number) helps keep track of what number the script is on.
end
set [Average v] to ((Sum) / (length of [Data v]))

The second step in calculating sample standard deviation is calculating the variance:

when gf clicked
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (item (Number) of [Data v])
  change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
  change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
  change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / ((length of [Data v]) - (1)))

The final step in calculating sample standard deviation is taking the square root of variance:

when gf clicked
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (item (Number) of [Data v])
  change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
  change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
  change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / ((length of [Data v]) - (1)))
set [Standard Deviation v] to ([sqrt v] of (Variance))
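
For readers who want to compare the script against a text-based language, the same two-loop approach can be sketched in Python. This is only an illustrative translation: the function name `sample_standard_deviation` is hypothetical, and the check for fewer than two data points is an addition that the Scratch script does not have.

```python
import math

def sample_standard_deviation(data):
    """Mirror the Scratch script: one loop for the average, one for the variance."""
    if len(data) < 2:
        # Assumption: the Scratch script has no such check. With a single
        # item it would divide by zero, so this sketch refuses instead.
        raise ValueError("need at least two data points")

    total = 0                      # plays the role of Sum
    for value in data:             # Scratch tracks the position with Number
        total += value
    average = total / len(data)

    total_squared = 0              # plays the role of Sum2
    for value in data:             # Scratch tracks the position with Number2
        total_squared += (value - average) * (value - average)

    variance = total_squared / (len(data) - 1)
    return math.sqrt(variance)

print(round(sample_standard_deviation([25, 60, 40, 45, 55]), 3))
```

Calling it with the five flower heights should give about 13.693, the same value worked out by hand earlier.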

Only a small tweak is needed to calculate population standard deviation. The code for that would be:

when gf clicked
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (item (Number) of [Data v])
  change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
  change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
  change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / (length of [Data v]))
set [Standard Deviation v] to ([sqrt v] of (Variance))

Final Product

The code for calculating sample standard deviation is:

when gf clicked
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (item (Number) of [Data v])
  change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
  change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
  change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / ((length of [Data v]) - (1)))
set [Standard Deviation v] to ([sqrt v] of (Variance))

The code for calculating population standard deviation is:

when gf clicked
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (item (Number) of [Data v])
  change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum v] to (0)//Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
  change [Sum v] by (((item (Number) of [Data v]) - (Average)) * ((item (Number) of [Data v]) - (Average)))
  change [Number v] by (1)
end
set [Variance v] to ((Sum) / (length of [Data v]))
set [Standard Deviation v] to ([sqrt v] of (Variance))
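
Since the two scripts differ only in what the sum of squared differences is divided by, the idea can also be written as a single routine with a switch. The Python sketch below is one possible way to express that; the function name `standard_deviation` and the `sample` parameter are hypothetical and not part of the Scratch project.

```python
import math

def standard_deviation(data, sample=True):
    """Return the sample (default) or population standard deviation of data."""
    mean = sum(data) / len(data)
    sum_of_squares = sum((x - mean) ** 2 for x in data)
    # The only difference between the two types is the divisor.
    divisor = len(data) - 1 if sample else len(data)
    return math.sqrt(sum_of_squares / divisor)

heights = [25, 40, 45, 55, 60]
print(round(standard_deviation(heights, sample=True), 3))
print(round(standard_deviation(heights, sample=False), 3))
```

With the flower heights, the two calls should give about 13.693 and 12.247 respectively.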

See Also