Chapter 11 Language of Descriptive Statistics

Section 11.3 Statistical Measures

11.3.3 Measures of Dispersion

Means and quantiles are measures of position, i.e. they give information on the absolute position of the quantitative values

x_{j}

. If we add a constant

c

to every value

x_{j}

, then the position measures also increase by

c

. In contrast, measures of dispersion describe how widely the data values are spread relative to each other, independent of their absolute position. Consider a sample of size

n \geq 2

of a quantitative property

X

. Let the original list be given by

x = (x_{1}, x_{2}, \dots, x_{n}) \in ℝ^{n} .
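
The shift behaviour of position measures stated above can be checked for the arithmetic mean directly from its definition. This is only a short sketch; it assumes the mean is defined as the sum of all values divided by n, and it uses the constant c from the text:

\frac{1}{n} \sum_{k = 1}^{n} (x_{k} + c) = \frac{1}{n} \sum_{k = 1}^{n} x_{k} + \frac{1}{n} \cdot n \cdot c = \overline{x} + c .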

Info 11.3.15

The sample variance of the original list is defined as

s_{x}^{2} = \frac{1}{n - 1} \cdot \sum_{k = 1}^{n} (x_{k} - \overline{x})^{2} = \frac{(x_{1} - \overline{x})^{2} + \dots + (x_{n} - \overline{x})^{2}}{n - 1} .

The sample standard deviation is defined by

s_{x} = + \sqrt{s_{x}^{2}}

The sample variance is a measure of dispersion that describes the variability of the observed sample. The smaller the variance, the "closer" the data values lie to each other. A variance

s_{x}^{2} = 0

is only possible if all data values are equal. Typically, it strongly increases with increasing

n

. The standard deviation is a more appropriate measure for the "broadness" of the distribution of data values. The two formulas given above have a few pitfalls:

Before the variance can be calculated the mean $\overline{x}$ must already be known.
The definition of $s_{x}^{2}$ divides by $n - 1$ rather than by $n$ for deeper mathematical reasons that can only be discussed in a statistics lecture.
The notation $s_{x} = + \sqrt{s_{x}^{2}}$ is a little misleading. You must not cancel the square against the square root: $s_{x}^{2}$ is defined as a sum, not as a single square, so this sum must be calculated first to determine $s_{x}$ .
Be careful when using a scientific calculator with statistical functions: the sample variance is available via the $s^{2}$ key. The $σ^{2}$ key, however, provides the sum with denominator $n$ instead of $n - 1$ . This is not the sample variance.
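
The definition and the calculator pitfall above can be illustrated with a minimal Python sketch. It assumes Python's standard `statistics` module, whose `variance` function divides by n - 1 and whose `pvariance` function divides by n. The helper name `sample_variance` and the data values are chosen here purely for illustration and are not part of the course text.

```python
import math
import statistics


def sample_variance(values):
    """Sample variance with denominator n - 1, following the definition above."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two values")
    mean = sum(values) / n  # the mean must be known before the variance
    return sum((v - mean) ** 2 for v in values) / (n - 1)


data = [2.0, 4.0, 4.0, 6.0]  # illustrative values, not taken from the text

s2 = sample_variance(data)            # denominator n - 1 (the "s^2" key)
s = math.sqrt(s2)                     # sample standard deviation
sigma2 = statistics.pvariance(data)   # denominator n (the "sigma^2" key)

print("sample variance:", s2)
print("sample standard deviation:", s)
print("variance with denominator n:", sigma2)
```

For any sample whose values are not all equal, the value with denominator n is smaller than the sample variance, which is why the two calculator keys give different results.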

Example 11.3.16

The data sequence

x = (- 1,0, 1)

has the mean

\overline{x} = 0

and the sample variance

s_{x}^{2} = \frac{1}{n - 1} \cdot \sum_{k = 1}^{n} (x_{k} - \overline{x})^{2} = \frac{1}{3 - 1} \cdot ((- 1 - 0)^{2} + (0 - 0)^{2} + (1 - 0)^{2}) = 1 .

Adding further zeros to the data sequence does not change the position measure

\overline{x}

, but the measure of dispersion

s_{x}^{2}

does change, since the data values are then more strongly concentrated around the mean. In contrast, shifting all data values by a constant does not change the variance. For example, the data sequence

(- 5, - 4, - 3)

also has variance

1

.
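
Both observations of this example can be checked numerically. The following sketch assumes Python's standard `statistics` module (its `variance` function uses the denominator n - 1). The choice of adding exactly two zeros is an illustration and is not prescribed by the text.

```python
import statistics

original = [-1, 0, 1]
with_zeros = [-1, 0, 0, 0, 1]          # two further zeros added (illustrative)
shifted = [v - 4 for v in original]    # every value shifted by the constant -4

mean_original = statistics.mean(original)
mean_with_zeros = statistics.mean(with_zeros)

var_original = statistics.variance(original)
var_with_zeros = statistics.variance(with_zeros)
var_shifted = statistics.variance(shifted)

print("original:  ", mean_original, var_original)
print("with zeros:", mean_with_zeros, var_with_zeros)
print("shifted:   ", shifted, var_shifted)
```

With the two added zeros, the sum of squared deviations is still 2, but it is divided by 4 instead of 2, so the variance drops while the mean stays at 0.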

Exercise 11.3.17

A data sequence (with an unknown number

n

of values) has the measures

\overline{x} = 4

,

s_{x}^{2} = 10

, and the median

\tilde{x} = 3

. Suppose the values of a second data sequence satisfy the equation

y_{k} = (- 2) \cdot x_{k}

for every

k

. What are its measures?
Answer: the measures are

\overline{y}

=

s_{y}^{2}

=

, and

\tilde{y}

=

.
Hint: recall the definitions of the mean, the sample variance, and the median, and consider how multiplying all

x

-values by a factor of

(- 2)

influences the entire expression.

Substituting the new values

y_{k} = (- 2) \cdot x_{k}

into the definitions results in

\begin{matrix} \overline{y} & = & \frac{1}{n} \sum_{k = 1}^{n} y_{k} = \frac{1}{n} \sum_{k = 1}^{n} (- 2) \cdot x_{k} = (- 2) \cdot \frac{1}{n} \sum_{k = 1}^{n} x_{k} = (- 2) \cdot \overline{x} = - 8, \\ s_{y}^{2} & = & \frac{1}{n - 1} \sum_{k = 1}^{n} {(y_{k} - \overline{y})}^{2} = \frac{1}{n - 1} \sum_{k = 1}^{n} {((- 2) x_{k} - (- 2) \overline{x})}^{2} \\ & = & \frac{(- 2)^{2}}{n - 1} \sum_{k = 1}^{n} {(x_{k} - \overline{x})}^{2} = (- 2)^{2} \cdot s_{x}^{2} = 40, \\ \tilde{y} & = & (- 2) \tilde{x} = - 6 . \end{matrix}

The conversion of the median uses the fact that a multiplication by a factor of

(- 2)

reverses the ordering of the ordered original list, but the value at the mid position (for an odd number of values) or the two values at the mid positions (for an even number of values) stay at the mid positions and are each multiplied by

(- 2)

.
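
The scaling rules derived in this solution can be tried out on concrete numbers. The sketch below assumes Python's standard `statistics` module and uses a hypothetical data sequence: it happens to have mean 4 and median 3 like the exercise, but its variance is 13 rather than 10, because the actual sequence of the exercise is unknown.

```python
import statistics

x = [1, 3, 8]                 # hypothetical data, NOT the sequence from the exercise
y = [-2 * v for v in x]       # y_k = (-2) * x_k for every k

mean_x, var_x, med_x = statistics.mean(x), statistics.variance(x), statistics.median(x)
mean_y, var_y, med_y = statistics.mean(y), statistics.variance(y), statistics.median(y)

print("x:", mean_x, var_x, med_x)
print("y:", mean_y, var_y, med_y)

# Compare with the rules from the solution above
print("mean scales by -2:       ", mean_y == -2 * mean_x)
print("variance scales by (-2)^2:", var_y == (-2) ** 2 * var_x)
print("median scales by -2:     ", med_y == -2 * med_x)
```

The mean and the median pick up the factor (- 2) including its sign, whereas the variance picks up the squared factor 4 and therefore stays non-negative.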

