4 Measures of Central Tendency -II

4.1 Geometric mean

The geometric mean is a type of average, usually used for growth rates, like population growth or interest rates. While the arithmetic mean adds items, the geometric mean multiplies items.

The geometric mean of a series containing n positive observations is the nth root of the product of the values. If \(x_{1},x_{2},\ldots,x_{n}\) are the observations, then

\[\text{Geometric mean, GM} = \sqrt[n]{x_{1}x_{2}\ldots x_{n}} = \left( x_{1}x_{2}\ldots x_{n} \right)^{\frac{1}{n}}\]

Taking logarithms on both sides,

\[\log \text{GM} = \frac{1}{n}\log\left( x_{1}x_{2}\ldots x_{n} \right) = \frac{1}{n}\left( \log x_{1} + \log x_{2} + \ldots + \log x_{n} \right) = \frac{\sum_{i = 1}^{n}{\log x_{i}}}{n}\]

so that

\[\text{GM} = \text{Antilog}\left( \frac{\sum_{i = 1}^{n}{\log x_{i}}}{n} \right)\]

4.1.1 Geometric mean for grouped frequency table data

\[\mathbf{GM = \ Antilog}\left( \frac{\sum_{\mathbf{i = 1}}^{\mathbf{k}}{{\mathbf{f}_{\mathbf{i}}\mathbf{\log}}\mathbf{x}_{\mathbf{i}}}}{\mathbf{n}} \right)\]

where \(x_{i}\) is the mid-value of the ith class, \(f_{i}\) is its frequency, k is the number of classes and \(n = \sum_{i = 1}^{k}f_{i}\) is the total frequency.

Example 4.1: The weights of five sorghum ear heads are 45, 60, 48, 100 and 65 g. Find the geometric mean.

Weight of ear head (x) log(x)
45 1.653
60 1.778
48 1.681
100 2.000
65 1.813
Total 8.926

Solution 4.1:

Here n =5

\[\text{Geometric mean} = \text{Antilog}\left( \frac{\sum_{i = 1}^{n}{\log x_{i}}}{n} \right) = \text{Antilog}\left( \frac{8.926}{5} \right) = \text{Antilog}(1.785) = 60.95\]

Note: here \(\text{Antilog}\left( x \right) = 10^{x}\), i.e. \(\text{Antilog}\left( 1.785 \right) = 10^{1.785} = 60.95\).
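The log-based calculation in Example 4.1 can be sketched in Python as follows. This is a minimal illustration, not part of the original notes; the function name `geometric_mean` is an assumption chosen for this sketch. Because it carries full floating-point precision, the result (about 60.97) differs slightly from the 60.95 obtained with logarithms rounded to three decimals.

```python
import math

def geometric_mean(values):
    """Geometric mean computed as the antilog of the mean of base-10 logs."""
    if any(v <= 0 for v in values):
        # Logarithms are undefined for zero and negative values.
        raise ValueError("geometric mean requires strictly positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log  # Antilog(x) = 10**x

weights = [45, 60, 48, 100, 65]  # ear-head weights in grams (Example 4.1)
gm = geometric_mean(weights)
print(round(gm, 2))
```

The same value is obtained by taking the fifth root of the product directly, which is a quick way to check the logarithmic route.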

Example 4.2: Geometric mean of a Frequency Distribution

Weight of ear head (x) Frequency(f) log(x) flog(x)
45 5 1.653 8.266
60 4 1.778 7.113
48 6 1.681 10.087
100 8 2.000 16.000
65 9 1.813 16.316
Total 32 57.782

Solution 4.2: Here n =32

\[GM = \ Antilog\left( \frac{\sum_{i = 1}^{k}{{f_{i}\log}x_{i}}}{n} \right)\]

\[\sum_{i = 1}^{k}{f_{i}\log x_{i}} = 57.782\]

\[\text{GM} = \text{Antilog}\left( \frac{57.782}{32} \right) = \text{Antilog}\left( 1.8056 \right) = 10^{1.8056} = 63.92\]

Example 4.3: Geometric mean of a Grouped Frequency Distribution

Class Mid value (x) Frequency(f) log(x) flog(x)
60-80 70 5 1.845 9.225
80-100 90 4 1.954 7.817
100-120 110 6 2.041 12.248
120-140 130 8 2.114 16.912
140-160 150 9 2.176 19.585
Total 32 65.787

Solution 4.3:
Here n = 32

\[GM = \ Antilog\left( \frac{\sum_{i = 1}^{k}{{f_{i}\log}x_{i}}}{n} \right)\]

\[\sum_{i = 1}^{k}{f_{i}\log x_{i}} = 65.787\]

\[\text{GM} = \text{Antilog}\left( \frac{65.787}{32} \right) = \text{Antilog}\left( 2.0558 \right) = 10^{2.0558} = 113.72\]
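The frequency-weighted formula used in Examples 4.2 and 4.3 can be sketched in Python as below. The function name `grouped_geometric_mean` and its argument names are assumptions made for this illustration. For a grouped table the class mid-values are passed as the x values.

```python
import math

def grouped_geometric_mean(values, frequencies):
    """GM = Antilog( sum(f_i * log x_i) / n ), with n = sum of frequencies."""
    n = sum(frequencies)
    weighted_log_sum = sum(f * math.log10(x) for x, f in zip(values, frequencies))
    return 10 ** (weighted_log_sum / n)

frequencies = [5, 4, 6, 8, 9]

# Example 4.2: discrete values with frequencies
gm_discrete = grouped_geometric_mean([45, 60, 48, 100, 65], frequencies)

# Example 4.3: class mid-values of 60-80, 80-100, ..., 140-160
gm_grouped = grouped_geometric_mean([70, 90, 110, 130, 150], frequencies)

print(round(gm_discrete, 1), round(gm_grouped, 1))
```

The results agree with the worked values (about 63.9 and 113.7); small differences in the second decimal place come from rounding the logarithms in the tables.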

4.1.2 Merits and Demerits of Geometric mean

Merits

  • It is rigidly defined.

  • It is based on all the observations of the series.

  • It is suitable for measuring the relative changes.

  • It gives more weight to small values and less weight to large values.

  • It is used in averaging ratios and percentages and in determining rates of gradual increase and decrease.

  • It is capable of further algebraic treatment.

Demerits

  • It is not easy to understand.

  • It is difficult to calculate.

  • It cannot be calculated if the number of negative values is odd.

  • It cannot be calculated if any value of the series is zero.

  • At times it gives a value that is not found in the series or is impractical.

4.2 Harmonic mean

Harmonic means are often used in averaging rates (e.g. the average travel speed over several trips of equal distance). The harmonic mean (HM) of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the given values.

If \(x_{1},\ x_{2},\ldots,\ x_{n}\) are n observations then

\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{n}\frac{1}{x_{i}}}\]

In case of Frequency distribution

\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{k}{f_{i}\frac{1}{x_{i}}}}\]

where \(x_{i}\) is the mid-value of the ith class, \(f_{i}\) is its frequency, k is the number of classes and \(n = \sum_{i = 1}^{k}f_{i}\) is the total frequency.

4.2.1 Steps in calculating Harmonic Mean (H.M)

  1. Calculate the reciprocal (1/value) for every value.

  2. Find the average of those reciprocals (add them and divide by how many there are).

  3. Take the reciprocal of that average (= 1/average).

Example 4.4: Calculate the H.M of the data 5, 10, 17, 24, 30.

Solution 4.4:

Here n = 5

x 1/x
5 0.2
10 0.1
17 0.058824
24 0.041667
30 0.033333
Total 0.433824

\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{n}\frac{1}{x_{i}}} = \frac{5}{0.433824} = 11.525\]
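The three steps of Section 4.2.1 translate almost line for line into Python. The sketch below is illustrative only; the function name `harmonic_mean` is an assumption of this example, and it reproduces the figures of Example 4.4.

```python
def harmonic_mean(values):
    """Harmonic mean following the three steps of Section 4.2.1."""
    reciprocals = [1 / v for v in values]              # step 1: reciprocal of every value
    mean_reciprocal = sum(reciprocals) / len(values)   # step 2: average of the reciprocals
    return 1 / mean_reciprocal                         # step 3: reciprocal of that average

data = [5, 10, 17, 24, 30]  # Example 4.4
hm = harmonic_mean(data)
print(round(hm, 3))
```

Python's standard library also offers `statistics.harmonic_mean`, which can be used to cross-check the result.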

Example 4.5: Number of tomatoes per plant are given below. Calculate the harmonic mean.

No. of tomatoes per plant 20 21 22 23 24 25
No. of Plants 4 2 7 1 3 1

Solution 4.5:

x f 1/x f(1/x)
20 4 0.05 0.2
21 2 0.047619 0.095238
22 7 0.045455 0.318182
23 1 0.043478 0.043478
24 3 0.041667 0.125
25 1 0.04 0.04
Total 18 0.821898

Here n =18

\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{k}{f_{i}\frac{1}{x_{i}}}} = \frac{18}{0.821898} = 21.90\]
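For a frequency distribution, the calculation of Example 4.5 can be sketched as follows. The function name `harmonic_mean_freq` and the variable names are assumptions made for this illustration.

```python
def harmonic_mean_freq(values, frequencies):
    """H.M = n / sum(f_i / x_i), where n is the total frequency."""
    n = sum(frequencies)
    weighted_reciprocal_sum = sum(f / x for x, f in zip(values, frequencies))
    return n / weighted_reciprocal_sum

tomatoes_per_plant = [20, 21, 22, 23, 24, 25]  # x
number_of_plants = [4, 2, 7, 1, 3, 1]          # f
hm_tomatoes = harmonic_mean_freq(tomatoes_per_plant, number_of_plants)
print(round(hm_tomatoes, 2))
```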

4.2.2 Merits and Demerits of Harmonic mean

Merits

  • It is rigidly defined.

  • It is defined on all observations.

  • It is amenable to further algebraic treatment.

  • It is the most suitable average when it is desired to give greater weight to smaller observations and less weight to larger ones.

Demerits

  • It is not easily understood.

  • It is difficult to compute.

  • It is only a summary figure and may not be the actual item in the series.

  • It gives greater importance to small items and is therefore useful only when small items have to be given greater weight.

  • It is rarely used in grouped data.

4.3 Relation between AM, GM and HM

If AM stands for Arithmetic Mean, GM for Geometric Mean and HM for Harmonic Mean, then for any two positive observations

\[\mathbf{\text{AM}}\mathbf{\times}\mathbf{\text{HM}}\mathbf{=}\mathbf{\text{GM}}^{\mathbf{2}}\]

Also, for any set of positive observations,

\[\mathbf{AM \geq GM \geq HM}\]

with equality only when all the observations are equal.
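These relations can be checked numerically. The sketch below uses Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later); the data sets are illustrative choices, not taken from a specific example. The product identity is exact for a pair of positive values; for longer series it generally does not hold, as the second check suggests.

```python
import statistics

# Ordering AM >= GM >= HM on a series of five positive values
data = [5, 10, 17, 24, 30]
am = statistics.mean(data)
gm = statistics.geometric_mean(data)
hm = statistics.harmonic_mean(data)
print(am >= gm >= hm)

# Product identity AM x HM = GM^2 for a pair of positive values
a, b = 4.0, 9.0
am_pair = (a + b) / 2
gm_pair = (a * b) ** 0.5
hm_pair = 2 * a * b / (a + b)
print(abs(am_pair * hm_pair - gm_pair ** 2) < 1e-9)

# For the five-value series the product differs from GM^2
print(round(am * hm, 1), round(gm ** 2, 1))
```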

4.4 When to use AM, GM and HM?

A practical answer is that it depends on what your numbers are measuring.

If you are measuring quantities that add up linearly, such as lengths, distances or weights, then an arithmetic mean will give you a meaningful average. For example, the arithmetic mean of the heights or weights of students in a class represents the average height or weight of students in the class.

The harmonic mean will give you a meaningful average if you are measuring quantities that add up as reciprocals, such as speeds over equal distances, capacitances in series or resistances in parallel. For example, the harmonic mean of the capacitances of n capacitors in series is the common capacitance that n identical capacitors in series would need in order to match the capacitance of the original set.

If you are measuring quantities that multiply in a sequence, such as growth rates or percentage changes, then a geometric mean will give you a meaningful average. For example, the geometric mean of the annual growth factors (1 + interest rate) over ten years corresponds to the constant interest rate that, if applied for ten years, would produce the same growth in principal as the sequence of different annual rates did.
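Two small illustrations of this choice are sketched below with Python's standard `statistics` module. The speeds and interest rates are hypothetical figures invented for this example.

```python
import statistics

# Harmonic mean: two legs of equal distance driven at 40 km/h and 60 km/h
# (hypothetical figures). The overall average speed is the harmonic mean.
speeds = [40, 60]
average_speed = statistics.harmonic_mean(speeds)

# Geometric mean: interest rates of 5%, 10% and 20% in three successive
# years (hypothetical figures), expressed as growth factors 1 + rate.
growth_factors = [1.05, 1.10, 1.20]
average_factor = statistics.geometric_mean(growth_factors)
average_rate = average_factor - 1

print(round(average_speed, 2), round(average_rate * 100, 2))
```

Applying `average_factor` three times gives the same overall growth as the three different yearly factors, which is the sense in which the geometric mean is the meaningful average here. The arithmetic mean of the two speeds (50 km/h) would overstate the true average speed of about 48 km/h.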

4.5 Positional Averages

A positional average is a value taken from the ordered series itself, chosen by its position in the series, that represents the whole series.

In the median, the middlemost value of the series is taken as the representative value; the median is therefore a positional average. The mode is also a positional average, as modal values are the most frequently occurring values and are taken directly from the series itself. Other positional averages include percentiles, quartiles and deciles.

Note that the arithmetic mean, harmonic mean and geometric mean are termed mathematical averages.

4.5.1 Quartiles

The median divides a set of data into two equal parts. We can also divide a set of data into more than two parts. When an ordered set of data is divided into four equal parts, the division points are called quartiles.

The first or lower quartile (\(\mathbf{Q}_{\mathbf{1}}\)) is a value that has one fourth, or 25% of the observations below its value.

The second quartile (\(\mathbf{Q}_{\mathbf{2}}\)), has one-half, or 50% of the observations below its value. The second quartile is equal to the median.

The third or upper quartile (\(\mathbf{Q}_{\mathbf{3}}\)) is a value that has three-fourths, or 75% of the observations below it.

\(\mathbf{Q}_{\mathbf{1}}\mathbf{=}\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\) item

\(\mathbf{Q}_{\mathbf{3}}\mathbf{=}\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\) item

The calculation of quartiles is explained using the example below. Note the procedure followed when a fraction appears in the calculation.

Example 4.6: Compute quartiles for the data 25, 18, 30, 8, 15, 5, 10, 35, 40, 45

Solution 4.6:

First arrange the data in ascending order

5, 8, 10, 15, 18, 25, 30, 35, 40, 45

here n = 10

\(\mathbf{Q}_{\mathbf{1}}\mathbf{=}\left(\frac{\mathbf{n + 1}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\)item

i.e. \(Q_{1} = \left( \frac{10 + 1}{4} \right)^{th}\) = 2.75th item. When such a fraction appears, we interpolate between the neighbouring items:

\(Q_{1}\) = 2.75th item = 2nd item + 0.75(3rd item - 2nd item)

So from the given data \(Q_{1}\) = 8 + 0.75(10 - 8) = 9.5

\[\mathbf{Q}_{\mathbf{2}}\mathbf{= median}\]

here \(Q_{2}\) = (18 + 25)/2 = 21.5

\(\mathbf{Q}_{\mathbf{3}}\mathbf{=}\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\) item

i.e. \(Q_{3} = \left( 3 \times \frac{(10 + 1)}{4} \right)^{th}\) = 8.25th item = 8th item + 0.25(9th item - 8th item) = 35 + 0.25(40 - 35) = 36.25
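The positional rule with interpolation used in Example 4.6 can be sketched in Python as follows. The helper name `item_at` is an assumption of this sketch; it returns the value at a 1-based position that may contain a fraction.

```python
import statistics

def item_at(sorted_data, position):
    """Value at a 1-based, possibly fractional, position by linear interpolation."""
    whole = int(position)          # e.g. 2 for the 2.75th item
    fraction = position - whole    # e.g. 0.75 for the 2.75th item
    if whole < 1:
        return sorted_data[0]
    if whole >= len(sorted_data):
        return sorted_data[-1]
    lower_item = sorted_data[whole - 1]
    upper_item = sorted_data[whole]
    return lower_item + fraction * (upper_item - lower_item)

data = sorted([25, 18, 30, 8, 15, 5, 10, 35, 40, 45])  # Example 4.6
n = len(data)

q1 = item_at(data, (n + 1) / 4)       # 2.75th item -> 9.5
q2 = item_at(data, (n + 1) / 2)       # 5.5th item  -> 21.5
q3 = item_at(data, 3 * (n + 1) / 4)   # 8.25th item -> 36.25
print(q1, q2, q3)

# Cross-check with the standard library (Python 3.8+); its default
# "exclusive" method is based on the same (n + 1) positions.
print(statistics.quantiles(data, n=4))
```

Other software may use different interpolation conventions, so quartiles reported by different packages need not agree exactly with the (n + 1) rule used here.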

4.5.1.1 Quartiles of a discrete frequency data

  1. Find the cumulative frequencies.

  2. Find \(\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)\).

  3. Locate the cumulative frequency just greater than \(\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)\); the corresponding value of \(x\) is \(Q_{1}\).

  4. Find \(\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)\).

  5. Locate the cumulative frequency just greater than \(\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)\); the corresponding value of \(x\) is \(Q_{3}\).

Example 4.7: Compute quartiles for the data given below

\(\mathbf{x}\) 5 8 12 15 19 24 30
\(\mathbf{f}\) 4 3 2 4 5 2 4

Solution 4.7:

x f cf
5 4 4
8 3 7
12 2 9
15 4 13
19 5 18
24 2 20
30 4 24

Here n = 24

\(\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)\) = \(\left( \frac{\mathbf{25}}{\mathbf{4}} \right)\) = 6.25

The cumulative frequency just greater than 6.25 is 7, and the \(\mathbf{x}\) value corresponding to cumulative frequency 7 is 8. So \(\mathbf{Q}_{\mathbf{1}}\) = 8

\(\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)\) = \(\left( \frac{\mathbf{3}\mathbf{\times}\mathbf{25}}{\mathbf{4}} \right)\) = 18.75

The cumulative frequency just greater than 18.75 is 20, and the \(\mathbf{x}\) value corresponding to cumulative frequency 20 is 24. So \(\mathbf{Q}_{\mathbf{3}}\) = 24
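The cumulative-frequency lookup of Example 4.7 can be sketched in Python as below. The function name `discrete_quartile` is an assumption of this sketch, and it treats a cumulative frequency exactly equal to the position as qualifying, which is a choice made here rather than something stated in the steps.

```python
from itertools import accumulate

def discrete_quartile(values, frequencies, k):
    """k-th quartile (k = 1 or 3) of a discrete frequency distribution."""
    n = sum(frequencies)
    position = k * (n + 1) / 4
    # Walk through the running totals until one reaches the position.
    for x, cf in zip(values, accumulate(frequencies)):
        if cf >= position:  # assumption: equality also qualifies
            return x
    return values[-1]

x_values = [5, 8, 12, 15, 19, 24, 30]
f_values = [4, 3, 2, 4, 5, 2, 4]

q1_discrete = discrete_quartile(x_values, f_values, 1)  # position 6.25
q3_discrete = discrete_quartile(x_values, f_values, 3)  # position 18.75
print(q1_discrete, q3_discrete)
```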

4.5.1.2 Quartiles of a continuous frequency data

  1. Find the cumulative frequencies.

  2. Find \(\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\).

  3. Locate the cumulative frequency just greater than \(\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\); the corresponding class interval is called the first quartile class.

  4. Find \(3\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\).

  5. Locate the cumulative frequency just greater than \(3\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\); the corresponding class interval is called the third quartile class.

Then apply the respective formulae:

\[\mathbf{Q}_{\mathbf{1}}\mathbf{=}\mathbf{l}_{\mathbf{1}}\mathbf{+}\frac{\frac{\mathbf{n}}{\mathbf{4}}\mathbf{-}\mathbf{m}_{\mathbf{1}}}{\mathbf{f}_{\mathbf{1}}}\mathbf{\times}\mathbf{c}_{\mathbf{1}}\]

\[\mathbf{Q}_{\mathbf{3}}\mathbf{=}\mathbf{l}_{\mathbf{3}}\mathbf{+}\frac{\mathbf{3}\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\mathbf{-}\mathbf{m}_{\mathbf{3}}}{\mathbf{f}_{\mathbf{3}}}\mathbf{\times}\mathbf{c}_{\mathbf{3}}\]

Where, \(l_{1}\) = lower limit of the first quartile class

\(f_{1}\) = frequency of the first quartile class

\(c_{1}\) = width of the first quartile class

\(m_{1}\) = cumulative frequency preceding the first quartile class

\(l_{3}\) = lower limit of the 3rd quartile class

\(f_{3}\)= frequency of the 3rd quartile class

\(c_{3}\)= width of the 3rd quartile class

\(m_{3}\) = cumulative frequency preceding the 3rd quartile class

Example 4.8: Find the quartiles for the grouped frequency data given below

Class frequency cumulative frequency
0-10 11 11
10-20 18 29
20-30 25 54
30-40 28 82
40-50 30 112
50-60 33 145
60-70 22 167
70-80 15 182
80-90 12 194
90-100 10 204

Solution 4.8:

\(\left( \frac{n}{4} \right)\) = \(\frac{204}{4}\) = 51

The cumulative frequency value just greater than 51 is 54 so the class 20-30 is the 1st quartile class

\[\mathbf{Q}_{\mathbf{1}}\mathbf{=}\mathbf{l}_{\mathbf{1}}\mathbf{+}\frac{\frac{\mathbf{n}}{\mathbf{4}}\mathbf{-}\mathbf{m}_{\mathbf{1}}}{\mathbf{f}_{\mathbf{1}}}\mathbf{\times}\mathbf{c}_{\mathbf{1}}\]

\[\mathbf{= 20 +}\frac{\mathbf{51 - 29}}{\mathbf{25}}\mathbf{\times 10\ = 28.8}\]

\(3\left( \frac{n}{4} \right)\)= \(3 \times \frac{204}{4}\) = 153

The cumulative frequency value just greater than 153 is 167 so the class 60-70 is the 3rd quartile class

\[\mathbf{Q}_{\mathbf{3}}\mathbf{=}\mathbf{l}_{\mathbf{3}}\mathbf{+}\frac{\mathbf{3}\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\mathbf{-}\mathbf{m}_{\mathbf{3}}}{\mathbf{f}_{\mathbf{3}}}\mathbf{\times}\mathbf{c}_{\mathbf{3}}\]

\[\mathbf{= 60 +}\frac{\mathbf{153 - 145}}{\mathbf{22}}\mathbf{\times 10 = 63.64}\]
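The grouped-data formula can be sketched in Python as follows, using the table of Example 4.8. The function name `grouped_quantile` and the representation of each class as a `(lower limit, upper limit, frequency)` tuple are assumptions made for this illustration.

```python
def grouped_quantile(classes, fraction):
    """Quantile of grouped data: l + (fraction * n - m) / f * c.

    classes  -- list of (lower_limit, upper_limit, frequency), in ascending order
    fraction -- 0.25 for Q1, 0.75 for Q3
    """
    n = sum(f for _, _, f in classes)
    target = fraction * n
    preceding_cf = 0  # m: cumulative frequency preceding the current class
    for lower, upper, f in classes:
        if f > 0 and preceding_cf + f >= target:
            width = upper - lower  # c: width of the quantile class
            return lower + (target - preceding_cf) / f * width
        preceding_cf += f
    raise ValueError("fraction must lie between 0 and 1")

classes = [
    (0, 10, 11), (10, 20, 18), (20, 30, 25), (30, 40, 28), (40, 50, 30),
    (50, 60, 33), (60, 70, 22), (70, 80, 15), (80, 90, 12), (90, 100, 10),
]

q1_grouped = grouped_quantile(classes, 0.25)
q3_grouped = grouped_quantile(classes, 0.75)
print(round(q1_grouped, 2), round(q3_grouped, 2))
```

Passing each class with its own limits means the sketch does not rely on all classes having the same width.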

4.5.2 Percentiles

The percentile values divide an ordered set of data into 100 equal parts, each containing 1 percent of the observations. The xth percentile, denoted as \(P_{x}\), is the value below which x percent of the values in the distribution fall. It may be noted that the median is the 50th percentile, the 25th percentile is the first quartile \(Q_{1}\) and the 75th percentile is the third quartile \(Q_{3}\).

For raw data, first arrange the n observations in increasing order. Then the xth percentile is given by

\(\mathbf{P}_{\mathbf{x}}\mathbf{=}\left( \frac{\mathbf{x}\left( \mathbf{n + 1} \right)}{\mathbf{100}} \right)^{\mathbf{\text{th}}}\) item

For a frequency distribution the xth percentile is found by the following steps:

  1. Find the cumulative frequencies.

  2. Find \(\left( \frac{x \times n}{100} \right)\).

  3. Locate the cumulative frequency just greater than \(\left( \frac{x \times n}{100} \right)\); the corresponding class interval is called the percentile class.

  4. Use the following formula:

\[\mathbf{P}_{\mathbf{x}}\mathbf{= l +}\frac{\left( \frac{\mathbf{x \times n}}{\mathbf{100}} \right)\mathbf{- cf}}{\mathbf{f}}\mathbf{\times c}\]

Where

\(\mathbf{l}\) = lower limit of the percentile class

\(\mathbf{\text{cf}}\) = cumulative frequency preceding the percentile class

\(\mathbf{f}\) = frequency of the percentile class

\(\mathbf{c}\) = width of the percentile class

\(\mathbf{n}\) = total number of observations
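As an illustration of the percentile formula, the sketch below computes two percentiles for the frequency table of Example 4.8. The function name `grouped_percentile` is an assumption of this sketch, and it assumes that all classes share a common width c.

```python
def grouped_percentile(lower_limits, width, frequencies, x):
    """P_x = l + (x * n / 100 - cf) / f * c for classes of equal width."""
    n = sum(frequencies)
    target = x * n / 100
    cf = 0  # cumulative frequency preceding the current class
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and cf + f >= target:
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("x must lie between 0 and 100")

lower_limits = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
frequencies = [11, 18, 25, 28, 30, 33, 22, 15, 12, 10]

p10 = grouped_percentile(lower_limits, 10, frequencies, 10)
p90 = grouped_percentile(lower_limits, 10, frequencies, 90)
print(round(p10, 2), round(p90, 2))
```

For the 90th percentile, x × n / 100 = 183.6, the percentile class is 80-90 (preceding cumulative frequency 182, frequency 12), which gives 80 + (1.6 / 12) × 10, or about 81.33.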

Example 4.9: Compute \(\mathbf{P}_{\mathbf{25}}\)and \(\mathbf{P}_{\mathbf{75}}\) for the data 25, 18, 30, 8, 15, 5, 10, 35, 40, 45

Solution 4.9:

First arrange the data in ascending order

5, 8, 10, 15, 18, 25, 30, 35, 40, 45

Here n =10

\(\mathbf{P}_{\mathbf{25}}\mathbf{=}\left( \frac{\mathbf{25}\left( \mathbf{10 + 1} \right)}{\mathbf{100}} \right)^{\mathbf{\text{th}}}\) = 2.75th item

\(P_{25}\) = 2.75th item = 2nd item + 0.75(3rd item - 2nd item)

So from the given data \(P_{25}\) = 8 + 0.75(10 - 8) = 9.5

\(\mathbf{P}_{\mathbf{75}}\mathbf{=}\left( \frac{\mathbf{75}\left( \mathbf{10 + 1} \right)}{\mathbf{100}} \right)^{\mathbf{\text{th}}}\) = 8.25th item

i.e. \(P_{75}\) = 8.25th item = 8th item + 0.25(9th item - 8th item) = 35 + 0.25(40 - 35) = 36.25

Note: The data in this example are the same as in Example 4.6; it can be seen that \(P_{25} = Q_{1}\) and \(P_{75} = Q_{3}\), as is always the case.

Assignment 1: Find \(P_{25}\), \(P_{50}\) and \(P_{75}\) for Examples 4.7 and 4.8; verify that \(P_{50} = Q_{2}\), \(P_{25} = Q_{1}\) and \(P_{75} = Q_{3}\).

4.5.3 Deciles

Deciles are similar to quartiles. But while quartiles are three points that divide an ordered set of data into four quarters, deciles are nine points that divide an ordered set of data into ten equal parts. The xth decile is denoted as \(d_{x}\). It may be noted that the median is the 5th decile.

\(\mathbf{d}_{\mathbf{x}}\mathbf{=}\left( \frac{\mathbf{x}\left( \mathbf{n + 1} \right)}{\mathbf{10}} \right)^{\mathbf{\text{th}}}\) item

For a frequency distribution the xth decile is found by the following steps:

  1. Find the cumulative frequencies.

  2. Find \(\left( \frac{x \times n}{10} \right)\).

  3. Locate the cumulative frequency just greater than \(\left( \frac{x \times n}{10} \right)\); the corresponding class interval is called the decile class.

  4. Use the following formula:

\[\mathbf{d}_{\mathbf{x}}\mathbf{= l +}\frac{\left( \frac{\mathbf{x \times n}}{\mathbf{10}} \right)\mathbf{- cf}}{\mathbf{f}}\mathbf{\times c}\]

Where

\(\mathbf{l}\) = lower limit of the decile class

\(\mathbf{\text{cf}}\) = cumulative frequency preceding the decile class

\(\mathbf{f}\) = frequency of the decile class

\(\mathbf{c}\) = width of the decile class

\(\mathbf{n}\) = total number of observations
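As an illustration of the formula (this worked case is added here and is not one of the numbered examples), consider the third decile of the grouped data in Example 4.8. Here n = 204, so \(\frac{3 \times 204}{10}\) = 61.2. The cumulative frequency just greater than 61.2 is 82, so the decile class is 30-40, with \(l\) = 30, \(cf\) = 54, \(f\) = 28 and \(c\) = 10.

\[d_{3} = 30 + \frac{61.2 - 54}{28} \times 10 = 30 + 2.57 = 32.57\]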

Assignment 2: Find \(d_{5}\) for Examples 4.6, 4.7 and 4.8; verify that \(d_{5} = Q_{2} = P_{50} = \text{median}\).

 
 
 

“The best thing about being a statistician is that you get to play in everybody else’s backyard”. – John Tukey