Zho:Calculating the Standard Deviation

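The standard deviation, often also called the "mean square deviation" (均方差) in Chinese-language contexts, measures how spread out a data set is around its mean. Two data sets with the same mean do not necessarily have the same standard deviation. A large standard deviation means that most values differ considerably from the mean; a small standard deviation means that the values lie close to the mean.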

Definition of standard deviation
The standard deviation is the non-negative square root of the variance.

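What is variance? The variance is the mean of the squared differences between each value and the mean of all the values. In probability theory, variance measures how far a random variable deviates from its expected value (that is, its mean). The example below illustrates this.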
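Written as formulas, the definitions above look as follows. The symbols are introduced here for illustration and do not appear in the original text: x_1, ..., x_N are the data values, N is how many there are, \mu is their mean, \sigma^2 is the variance, and \sigma is the standard deviation.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
```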

Suppose we have 5 flowers with heights of 25 cm, 60 cm, 40 cm, 45 cm, and 55 cm. Their mean height is:

(25 + 60 + 40 + 45 + 55) / 5 = 45

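So the mean height of the flowers is 45 cm. What, then, is their variance? Taking the flowers in ascending order of height, square each height's difference from the mean, then average the squares: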

Flower #1: ((25) - (45))^2 = (-20)^2 = 400

Flower #2: ((40) - (45))^2 = (-5)^2 = 25

Flower #3: ((45) - (45))^2 = (0)^2 = 0

Flower #4: ((55) - (45))^2 = (10)^2 = 100

Flower #5: ((60) - (45))^2 = (15)^2 = 225

(400 + 25 + 0 + 100 + 225) / 5 = 150

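So the variance of the flower heights is 150 cm² (variance is expressed in squared units, not in centimeters). The standard deviation is therefore the square root of 150, which is approximately 12.247 cm.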
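The arithmetic above can be double-checked outside Scratch. The following is a minimal Python sketch, not part of the Scratch project; the variable names (`heights`, `population_variance`, and so on) are chosen here purely for illustration.

```python
import math

# Flower heights in centimeters, taken from the example above.
heights = [25, 60, 40, 45, 55]

# Mean: add up the heights and divide by how many there are.
mean = sum(heights) / len(heights)

# Squared difference of each height from the mean.
squared_diffs = [(h - mean) ** 2 for h in heights]

# Population variance: the average of the squared differences.
population_variance = sum(squared_diffs) / len(heights)

# Standard deviation: the square root of the variance.
population_sd = math.sqrt(population_variance)

print(mean, population_variance, round(population_sd, 3))
# prints: 45.0 150.0 12.247
```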

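There are two kinds of standard deviation:

1. Population standard deviation, which describes the spread of an entire population. For example, if there were only 5 flowers in the world, 12.247 would be the population standard deviation of flower height.

2. Sample standard deviation, which is computed from only part of the data. For example, take our five flowers: there are obviously more than five flowers in the world, so these five are only a sample of the full data. When a sample is used to estimate the spread of the whole population, the computed standard deviation has to be scaled up slightly.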

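The only difference between the two standard deviations is how the variance is calculated. The population standard deviation follows the rule used in the example above. The sample standard deviation instead takes the sum of the squared differences from the mean and divides it by the number of values in the data set minus 1. For example, let's return to the flowers and recalculate their variance: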

(400 + 25 + 0 + 100 + 225) / (5 - 1) = 187.5

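Here, 5 is the number of flowers whose heights are known; 1 is subtracted from it because this is a sample standard deviation. The final step is to take the square root of 187.5, which is approximately 13.693.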
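As a cross-check, Python's standard `statistics` module provides both versions: `pvariance` and `pstdev` divide by the number of values, while `variance` and `stdev` divide by the number of values minus 1. A short sketch using the same flower heights (the variable names are illustrative):

```python
import statistics

# Flower heights in centimeters, taken from the example above.
heights = [25, 60, 40, 45, 55]

# Population standard deviation: divides by the number of values.
population_sd = statistics.pstdev(heights)

# Sample variance and standard deviation: divide by the number of values minus 1.
sample_variance = statistics.variance(heights)
sample_sd = statistics.stdev(heights)

print(sample_variance)          # 187.5
print(round(population_sd, 3))  # 12.247
print(round(sample_sd, 3))      # 13.693
```

The sample value is always the larger of the two, because the same sum of squares is divided by a smaller number.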

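Variables
This tutorial requires one list, which holds all of the data samples, such as the flower heights:

 * Data (the data set list)

Seven variables are also needed: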


 * Average
 * Sum
 * Variance
 * Standard Deviation
 * Number
 * Sum2
 * Number2

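Code
This tutorial first shows how to calculate the sample standard deviation.

The first step in calculating the sample standard deviation is to find the mean of the numbers. The script looks like this: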

when gf clicked
set [Sum v] to (0) // Initialize the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (item (Number) of [Data v])
change [Number v] by (1) // (Number) is the index used to step through the Data list.
end
set [Average v] to ((Sum) / (length of [Data v]))

The second step in calculating the sample standard deviation is to calculate the variance:

when gf clicked
set [Sum v] to (0) // Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (item (Number) of [Data v])
change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / ((length of [Data v]) - (1)))

The final step in calculating the sample standard deviation is to take the square root of the variance:

when gf clicked
set [Sum v] to (0) // Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (item (Number) of [Data v])
change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / ((length of [Data v]) - (1)))
set [Standard Deviation v] to ([sqrt v] of (Variance))

Calculating the population standard deviation requires only a small change: the variance is divided by the length of the list rather than by the length minus 1. The code is as follows:

when gf clicked
set [Sum v] to (0) // Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (item (Number) of [Data v])
change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / (length of [Data v])) // Population variance: divide by the full count.
set [Standard Deviation v] to ([sqrt v] of (Variance))

Full code
The code for calculating the sample standard deviation is:

when gf clicked
set [Sum v] to (0) // Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (item (Number) of [Data v])
change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum2 v] to (0)
set [Number2 v] to (1)
repeat (length of [Data v])
change [Sum2 v] by (((item (Number2) of [Data v]) - (Average)) * ((item (Number2) of [Data v]) - (Average)))
change [Number2 v] by (1)
end
set [Variance v] to ((Sum2) / ((length of [Data v]) - (1)))
set [Standard Deviation v] to ([sqrt v] of (Variance))

The code for calculating the population standard deviation is:

when gf clicked
set [Sum v] to (0) // Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (item (Number) of [Data v])
change [Number v] by (1)
end
set [Average v] to ((Sum) / (length of [Data v]))
set [Sum v] to (0) // Resetting the variables.
set [Number v] to (1)
repeat (length of [Data v])
change [Sum v] by (((item (Number) of [Data v]) - (Average)) * ((item (Number) of [Data v]) - (Average)))
change [Number v] by (1)
end
set [Variance v] to ((Sum) / (length of [Data v]))
set [Standard Deviation v] to ([sqrt v] of (Variance))
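To check a Scratch project's results, the same two-pass approach (one loop for the mean, one for the squared differences) can be sketched in Python. This is an illustrative re-implementation, not part of the Scratch scripts: the function name `standard_deviation` and its `sample` flag are hypothetical, and the `ValueError` guard is an addition here, since the Scratch scripts do not check for a list that is too short.

```python
import math


def standard_deviation(data, sample=True):
    """Two-pass standard deviation, mirroring the structure of the Scratch scripts."""
    count = len(data)
    # Added guard (not in the Scratch version): avoid dividing by zero.
    if count == 0 or (sample and count < 2):
        raise ValueError("not enough data values")

    # First pass: add up the values to get the mean (Sum, Average).
    total = 0
    for value in data:
        total += value
    average = total / count

    # Second pass: add up the squared differences from the mean (Sum2).
    squared_total = 0
    for value in data:
        squared_total += (value - average) * (value - average)

    # The sample version divides by count - 1, the population version by count.
    divisor = count - 1 if sample else count
    variance = squared_total / divisor
    return math.sqrt(variance)


heights = [25, 60, 40, 45, 55]
print(round(standard_deviation(heights, sample=True), 3))   # 13.693
print(round(standard_deviation(heights, sample=False), 3))  # 12.247
```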

See also

 * Finding the Mode of Numbers
 * Finding the Mean of Numbers
 * Finding the Median of Numbers
 * Finding the Range of Numbers
