# The Standard Deviation

Measuring the variability!

The standard deviation, like the variance, is a numerical value that indicates how widely the elements of a set vary from the mean. In fact, the standard deviation is simply the square root of the variance.

In the variance formula you square each value's deviation from the mean, so the result is naturally expressed in squared units. Taking the square root of the variance brings you back to the original, "plain" units of the data.
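To make the point about units concrete, here is a minimal Python sketch. The list `heights_cm` is hypothetical illustration data (it is not taken from this article), and `statistics.pvariance` is the standard library's population variance.

```python
import math
import statistics

# Hypothetical heights in centimetres, chosen only for illustration.
heights_cm = [150, 160, 170, 180, 190]

# The variance averages the squared deviations, so its unit is cm^2.
variance_cm2 = statistics.pvariance(heights_cm)

# The square root brings the figure back to plain centimetres.
std_dev_cm = math.sqrt(variance_cm2)

print(f"variance:           {variance_cm2:.1f} cm^2")
print(f"standard deviation: {std_dev_cm:.2f} cm")
```

The variance here is a number of square centimetres, which is hard to compare with the heights themselves; the standard deviation is in centimetres again.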

There are no other significant differences between the variance and the standard deviation, from a conceptual point of view. For any further information please refer to my previous article on the variance.

## Standard deviation in samples and populations

I've already tackled the differences between a sample and a population. We can, of course, calculate the standard deviation in both cases. The sample standard deviation is represented by the letter s, while the population standard deviation is represented by the symbol σ.
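As a rough sketch of how the two differ in practice, Python's standard `statistics` module offers `pstdev` (population, divides by N) and `stdev` (sample, divides by N - 1). The `values` list below is hypothetical and is not the data used later in this article.

```python
import statistics

# Hypothetical measurements, not data from this article.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Treat the values as the whole population: divide by N.
sigma = statistics.pstdev(values)

# Treat the values as a sample of a larger population: divide by N - 1.
s = statistics.stdev(values)

print(f"population sigma = {sigma:.4f}")
print(f"sample s         = {s:.4f}")
```

On the same numbers the sample figure is always a little larger than the population one, because it divides by a smaller number.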

## Standard deviation graphically explained

Figure 1 below shows a real-world example: the number of beers my imaginary friend Greg bought, and drank, in the past week (8 days). Negative numbers represent beers offered to him. I want to go through the data and calculate the standard deviation, then draw some conclusions.

Figure 1. Beers bought by Greg last week. Days are on the horizontal axis, number of beers on the vertical axis.

First of all, this is the whole data set, which means I'm working with the full population. Hence I compute the population standard deviation:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2}$$
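The formula can be followed step by step in code. This is only a sketch: `population_std` is a hypothetical helper name, and the input list is made-up illustration data rather than Greg's beers.

```python
import math

def population_std(xs):
    """Population standard deviation, following the formula term by term."""
    n = len(xs)                                  # N
    mu = sum(xs) / n                             # the mean
    squared_devs = [(x - mu) ** 2 for x in xs]   # (X_i - mean)^2 for each value
    return math.sqrt(sum(squared_devs) / n)      # sqrt of (1/N) * sum

# Hypothetical values, chosen only for illustration.
print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))
```

Each line maps to one piece of the formula: the mean, the squared deviations, their average, and finally the square root.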

Before computing the standard deviation I need to grab the mean (μ) of the values, that is:

$$\mu = 0.875$$

Figure 2 below shows the mean as a dotted line. We can say that Greg bought almost one beer a day, on average. At first glance you can already notice how much some values vary from the mean. Values like 4, 2 and -2 seem pretty far from it compared with the zeroes and ones. But I'm not satisfied with that: I want to churn out some concrete numbers.

Figure 2. The dotted line shows the current mean value of 0.875.

Finally, I compute the standard deviation. I'll spare you the gory steps; trust me when I say that its value is:

$$\sigma \approx 1.8850$$

Things are getting interesting: a distance of 1.8850 from the mean is one standard deviation, often written as 1SD. It marks out a range around the mean, as you can see in Figure 3 below, because the standard deviation is a measure of how spread out the numbers are from the mean.
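Taking the mean and standard deviation quoted above as given, the edges of that range work out as follows:

$$\mu - \sigma \approx 0.875 - 1.8850 = -1.010, \qquad \mu + \sigma \approx 0.875 + 1.8850 = 2.760$$

So any day on which Greg's count falls between roughly -1 and 2.8 beers is within one standard deviation of the mean.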

Figure 3. +1SD and -1SD ranges.

From a conceptual point of view, the standard deviation tells us how "normal", or "standard", a value is. The further a value drifts from the mean, the more standard deviations away it lies, and the more "unusual" it becomes. In our example, the 4 beers Greg bought on the 5th day fall in the +2SD band (about 1.7 standard deviations above the mean), which makes them a less typical value compared with the rest of the week.
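Expressing a value in SD units is a one-line calculation, often called a z-score. The sketch below reuses the mean and standard deviation quoted in this article; `sd_units` is a hypothetical helper name introduced only for illustration.

```python
# Values quoted in the text above.
mu = 0.875      # mean number of beers per day
sigma = 1.8850  # standard deviation reported above

def sd_units(value, mean, std_dev):
    """Signed distance of `value` from `mean`, measured in standard deviations."""
    return (value - mean) / std_dev

for beers in (4, 1, -2):
    print(f"{beers:>2} beers -> {sd_units(beers, mu, sigma):+.2f} SD")
```

With these numbers, the 4 beers come out at roughly +1.66 SD, that is, beyond the +1SD range and inside the +2SD one, while a single beer sits almost exactly on the mean.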

The standard deviation is also a solid unit of measurement. No more saying "a lot" or "few", or relying on magic numbers as we did when we computed the variance. Stick to SD units and you will have a formal tool for reasoning about variability.

## Sources

Liwen Vaughan - Statistical Methods for the Information Professional
Math Is Fun - Standard Deviation and Variance
Numeracy Skills - The Standard Deviation