A common question at this point is "Why is the numerator squared?" One answer is: to get rid of the negative signs. Some numbers fall above the mean and some fall below it, so their deviations from the mean have opposite signs. Since the variance measures distance from the mean, it would be counterproductive if those deviations cancelled each other out.
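The cancelling effect can be checked with a short calculation. The following is a minimal sketch in Python, using the data set from the worked example later in this section; the variable names are chosen only for illustration.

```python
# Data set borrowed from the worked example later in this section.
data = [57, 53, 58, 65, 48, 50, 66, 51]
mean = sum(data) / len(data)

# Signed deviations: values above the mean give positive numbers,
# values below the mean give negative numbers.
deviations = [x - mean for x in data]

# Squaring removes the signs, so nothing cancels.
squared_deviations = [d ** 2 for d in deviations]

print(sum(deviations))          # 0.0   (the signed deviations cancel)
print(sum(squared_deviations))  # 320.0 (the squared deviations do not)
```

The signed deviations always add up to zero, whatever the data, which is why they cannot be used directly as a measure of spread.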
As we have seen, a distinction is made between the variance, ${\sigma}^{2}$ , of a whole population and the variance, ${s}^{2}$ , of a sample drawn from that population.
When dealing with the complete population, the (population) variance is a constant: a parameter which helps to describe the population. When dealing with a sample from the population, the (sample) variance varies from sample to sample. Its value is only of interest as an estimate of the population variance.
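To see this difference in practice, the sketch below draws several samples from one made-up population. It is only an illustration: the population of 1000 heights is hypothetical and generated at random. It relies on Python's standard `statistics` module, where `pvariance` divides by $n$ and `variance` divides by $n-1$.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 1000 made-up heights in centimetres.
population = [random.gauss(170, 10) for _ in range(1000)]

# The population variance is a single fixed number (a parameter).
population_variance = statistics.pvariance(population)

# Each sample of 20 values gives its own estimate of that number.
sample_variances = [
    statistics.variance(random.sample(population, 20))
    for _ in range(5)
]

print(round(population_variance, 1))
print([round(v, 1) for v in sample_variances])
```

The first printed number stays the same however often it is recomputed for this population, while the five sample variances differ from one another and from the population value.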
The variance is never negative because the squares are always positive or zero. The unit of variance is the square of the unit of observation. For example, the variance of a set of heights measured in centimetres will be given in square centimetres. This fact is inconvenient and has motivated many statisticians to instead use the square root of the variance, known as the standard deviation, as a summary of dispersion.
Since the variance is a squared quantity, it cannot be directly compared to the data values or the mean value of a data set. It is therefore more useful to have a quantity which is the square root of the variance. This quantity is known as the standard deviation.
In statistics, the standard deviation is the most common measure of statistical dispersion. Standard deviation measures how spread out the values in a data set are. Roughly speaking, it measures the typical distance between the data values and the mean. If the data values are all similar, then the standard deviation will be low (closer to zero). If the data values are highly variable, then the standard deviation is high (further from zero).
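As a small illustration, the sketch below compares two made-up data sets with the same mean but different spread. The numbers are invented for this example, and `statistics.pstdev` is the population standard deviation from Python's standard library.

```python
import statistics

similar = [49, 50, 50, 51, 50]   # values close together
variable = [10, 90, 30, 70, 50]  # values far apart

# Both sets have the same mean of 50 ...
print(statistics.mean(similar), statistics.mean(variable))

# ... but very different standard deviations.
print(statistics.pstdev(similar))   # about 0.63
print(statistics.pstdev(variable))  # about 28.28
```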
The standard deviation is never negative (it is zero only when all the data values are equal) and is always measured in the same units as the original data. For example, if the data are distance measurements in metres, the standard deviation will also be measured in metres.
Let the population consist of $n$ elements $\{{x}_{1},{x}_{2},...,{x}_{n}\}$ , with mean $\overline{x}$ . The standard deviation of the population, denoted by $\sigma $ , is the square root of the average of the square of the distance of each data value from the mean value.
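Written in symbols, with the same $n$ , ${x}_{i}$ and $\overline{x}$ as above, this definition reads:

$$\sigma =\sqrt{\frac{\sum _{i=1}^{n}{({x}_{i}-\overline{x})}^{2}}{n}}$$

Squaring both sides gives the population variance, ${\sigma}^{2}$ .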
Let the sample consist of $n$ elements $\{{x}_{1},{x}_{2},...,{x}_{n}\}$ , taken from the population, with mean $\overline{x}$ . The standard deviation of the sample, denoted by $s$ , is the square root of the average of the squared deviations from the sample mean.
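Written in symbols, this definition can be sketched as follows. The divisor is an assumption here: we follow the usual convention for a sample, in which the "average" is taken by dividing by $n-1$ rather than by $n$ .

$$s=\sqrt{\frac{\sum _{i=1}^{n}{({x}_{i}-\overline{x})}^{2}}{n-1}}$$

Dividing by $n-1$ makes the result slightly larger than dividing by $n$ would, and the difference becomes small when the sample is large.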
It is often useful to set your data out in a table so that you can apply the formulae easily. For example, to calculate the standard deviation of $\{57;53;58;65;48;50;66;51\}$ , you could set it out in a table with one column for the data values, one for their deviations from the mean and one for the squared deviations.
Note: To get the deviations, subtract the mean from each number.
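The layout for this data set can be worked through with a short script. The following is a minimal sketch in Python, not part of the original worked example. It prints one row per data value (the value, its deviation from the mean and the squared deviation) and then computes the standard deviation both ways, dividing by $n$ for a population and, as an assumed convention, by $n-1$ for a sample.

```python
import math

data = [57, 53, 58, 65, 48, 50, 66, 51]
n = len(data)
mean = sum(data) / n

# Table header: data value, deviation, squared deviation.
print(f"{'x':>4} {'x - mean':>9} {'(x - mean)^2':>13}")

total = 0.0  # running sum of the squared deviations
for x in data:
    deviation = x - mean
    squared = deviation ** 2
    total += squared
    print(f"{x:>4} {deviation:>9.0f} {squared:>13.0f}")

# Population formula divides by n; sample formula (assumed) by n - 1.
population_sd = math.sqrt(total / n)
sample_sd = math.sqrt(total / (n - 1))

print("mean =", mean)                              # 56.0
print("sum of squared deviations =", total)        # 320.0
print("population sd =", round(population_sd, 2))  # 6.32
print("sample sd =", round(sample_sd, 2))          # 6.76
```

Which of the two final values to report depends on whether the eight numbers are treated as the whole population or as a sample from a larger one.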