4 Measures of Central Tendency -II
4.1 Geometric mean
The geometric mean is a type of average, usually used for growth rates, like population growth or interest rates. While the arithmetic mean adds items, the geometric mean multiplies items.
The geometric mean of a series containing n observations is the nth root of the product of the values. If \(x_{1},x_{2},\ldots, x_{n}\) are the observations, then
\[\mathbf{\text{Geometric mean}}\mathbf{,\ }\mathbf{GM =}\sqrt[\mathbf{n}]{\mathbf{x}_{\mathbf{1}}\mathbf{x}_{\mathbf{2}}\mathbf{\ldots}\mathbf{x}_{\mathbf{n}}}\]
\[\mathbf{=}\left( \mathbf{x}_{\mathbf{1}}\mathbf{x}_{\mathbf{2}}\mathbf{\ldots}\mathbf{x}_{\mathbf{n}} \right)^{\frac{\mathbf{1}}{\mathbf{n}}}\]
\[\mathbf{\log}\mathbf{\text{GM}}\mathbf{=}\frac{\mathbf{1}}{\mathbf{n}}\mathbf{\log}\left( \mathbf{x}_{\mathbf{1}}\mathbf{x}_{\mathbf{2}}\mathbf{\ldots}\mathbf{x}_{\mathbf{n}} \right)\]
\[\mathbf{=}\frac{\mathbf{1}}{\mathbf{n}}\left( \mathbf{\log}\mathbf{x}_{\mathbf{1}}\mathbf{+}\mathbf{\log}\mathbf{x}_{\mathbf{2}}\mathbf{\ldots +}\mathbf{\log}\mathbf{x}_{\mathbf{n}} \right)\]
\[\mathbf{=}\frac{\sum_{\mathbf{i = 1}}^{\mathbf{n}}{\mathbf{\log}\mathbf{x}_{\mathbf{i}}}}{\mathbf{n}}\]
\[\mathbf{\ GM = Antilog}\left( \frac{\sum_{\mathbf{i = 1}}^{\mathbf{n}}{\mathbf{\log}\mathbf{x}_{\mathbf{i}}}}{\mathbf{n}} \right)\]
4.1.1 Geometric mean for grouped frequency table data
\[\mathbf{GM = \ Antilog}\left( \frac{\sum_{\mathbf{i = 1}}^{\mathbf{k}}{{\mathbf{f}_{\mathbf{i}}\mathbf{\log}}\mathbf{x}_{\mathbf{i}}}}{\mathbf{n}} \right)\]
where \(x_{i}\) is the mid-value, \(f_{i}\) is the frequency, k is the number of classes and \(n = \sum f_{i}\) is the total frequency
Example 4.1: The weights of five sorghum ear heads are 45, 60, 48, 100 and 65 g. Find the geometric mean.
Weight of ear head (x) | log(x) |
---|---|
45 | 1.653 |
60 | 1.778 |
48 | 1.681 |
100 | 2.000 |
65 | 1.813 |
Total | 8.926 |
Solution 4.1:
Here n =5
Geometric mean=
\[\text{Antilog}\left( \frac{\sum_{i = 1}^{n}{\log x_{i}}}{n} \right) \]
\[=\text{Antilog}\left( \frac{8.926}{5} \right) \]
\[ =\text{Antilog}(1.785) = 60.95\]
Note: here \(\text{Antilog}\left( x \right) = 10^{x}\), i.e.
\[\text{Antilog}\left( 1.785 \right) = 10^{1.785} = 60.95\]
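The log-based calculation in Solution 4.1 can be sketched in Python as follows. This is a minimal illustration, not part of the original text: the function name `geometric_mean` is hypothetical, and rejecting zero or negative values is a simplifying assumption. Because the code uses unrounded logarithms, its result (about 60.97) differs slightly from the hand calculation, which rounds each log to three decimals.

```python
import math


def geometric_mean(values):
    """Geometric mean as the antilog of the mean of log10(x)."""
    # Assumption: only strictly positive values are accepted,
    # because log10 is undefined for zero and negative numbers.
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log  # Antilog(x) = 10^x


# Sorghum ear head weights from Example 4.1
ear_head_weights = [45, 60, 48, 100, 65]
gm = geometric_mean(ear_head_weights)
print(round(gm, 2))
```

The same value is obtained by taking the fifth root of the product directly, which is a quick way to check the log route.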
Example 4.2: Geometric mean of a Frequency Distribution
Weight of ear head (x) | Frequency(f) | log(x) | flog(x) |
---|---|---|---|
45 | 5 | 1.653 | 8.266 |
60 | 4 | 1.778 | 7.113 |
48 | 6 | 1.681 | 10.087 |
100 | 8 | 2.000 | 16.000 |
65 | 9 | 1.813 | 16.316 |
Total | 32 | | 57.782 |
Solution 4.2: Here n =32
\[GM = \ Antilog\left( \frac{\sum_{i = 1}^{k}{{f_{i}\log}x_{i}}}{n} \right)\]
\[\sum_{i = 1}^{k}{f_{i}\log x_{i}} = 57.782\]
\[\text{GM} = \text{Antilog}\left( \frac{57.782}{32} \right)\]
\[= \text{Antilog}\left( 1.8057 \right) = 10^{1.8057} = 63.93\]
Example 4.3: Geometric mean of a Grouped Frequency Distribution
Class | Mid value (x) | Frequency(f) | log(x) | flog(x) |
---|---|---|---|---|
60-80 | 70 | 5 | 1.845 | 9.225 |
80-100 | 90 | 4 | 1.954 | 7.817 |
100-120 | 110 | 6 | 2.041 | 12.248 |
120-140 | 130 | 8 | 2.114 | 16.912 |
140-160 | 150 | 9 | 2.176 | 19.585 |
Total | | 32 | | 65.787 |
Solution 4.3:
Here n =32
\[GM = \ Antilog\left( \frac{\sum_{i = 1}^{k}{{f_{i}\log}x_{i}}}{n} \right)\]
\[\sum_{i = 1}^{k}{f_{i}\log x_{i}} = 65.787\]
\[\text{GM} = \text{Antilog}\left( \frac{65.787}{32} \right)\]
\[= \text{Antilog}\left( 2.0558 \right) = 10^{2.0558} = 113.71\]
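The frequency-weighted formula used in the two preceding examples can be sketched as below. The function name `grouped_geometric_mean` is an illustrative assumption. For a grouped distribution the class mid-values are passed as `values`. The results agree with the worked solutions to within the rounding of the log table.

```python
import math


def grouped_geometric_mean(values, freqs):
    """GM = Antilog( sum(f_i * log10(x_i)) / n ), with n = sum(f_i)."""
    n = sum(freqs)
    weighted_log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (weighted_log_sum / n)


freqs = [5, 4, 6, 8, 9]

# Discrete frequency distribution (ear head weights and their frequencies)
gm_discrete = grouped_geometric_mean([45, 60, 48, 100, 65], freqs)

# Grouped distribution: mid-values of the classes 60-80, ..., 140-160
gm_grouped = grouped_geometric_mean([70, 90, 110, 130, 150], freqs)

print(round(gm_discrete, 2), round(gm_grouped, 2))
```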
4.1.2 Merits and Demerits of Geometric mean
Merits
It is rigidly defined.
It is based on all the observations of the series.
It is suitable for measuring the relative changes.
It gives more weights to the small values and less weight to the large values.
It is used in averaging ratios and percentages and in determining the rate of gradual increase or decrease.
It is capable of further algebraic treatment.
Demerits
It is not easy to understand.
It is difficult to calculate.
It cannot be calculated, if the number of negative values is odd.
It cannot be calculated, if any value of a series is zero.
At times it gives a value which may not be found in the series or impractical.
4.2 Harmonic mean
Harmonic means are often used in averaging things like rates (e.g. the average travel speed over several trips of equal distance). The harmonic mean (HM) of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the given values.
If \(x_{1},\ x_{2},\ldots,\ x_{n}\) are n observations then
\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{n}\frac{1}{x_{i}}}\]
In case of Frequency distribution
\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{k}{f_{i}\frac{1}{x_{i}}}}\]
where \(x_{i}\) is the mid-value, \(f_{i}\) is the frequency, k is the number of classes and \(n = \sum f_{i}\) is the total frequency
4.2.1 Steps in calculating Harmonic Mean (H.M)
Calculate the reciprocal (1/value) for every value.
Find the average of those reciprocals (add them and divide by how many there are).
Then take the reciprocal of that average (= 1/average).
Example 4.4: From the given data 5, 10, 17, 24, 30 calculate H.M
Solution 4.4:
Here n = 5
x | 1/x |
---|---|
5 | 0.2 |
10 | 0.1 |
17 | 0.058824 |
24 | 0.041667 |
30 | 0.033333 |
Total | 0.433824 |
\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{n}\frac{1}{x_{i}}} = \frac{5}{0.433824} = 11.525\]
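The three steps of Section 4.2.1 map directly onto code. The sketch below is illustrative only: the function name `harmonic_mean` is hypothetical, and restricting the input to positive values is an assumption made to avoid division by zero.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    # Assumption: values must be strictly positive.
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs strictly positive values")
    reciprocals = [1 / v for v in values]              # step 1: 1/value
    mean_reciprocal = sum(reciprocals) / len(values)   # step 2: average them
    return 1 / mean_reciprocal                         # step 3: reciprocal


hm = harmonic_mean([5, 10, 17, 24, 30])  # data from Example 4.4
print(round(hm, 3))
```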
Example 4.5: The numbers of tomatoes per plant are given below. Calculate the harmonic mean.
No. of tomatoes per plant | 20 | 21 | 22 | 23 | 24 | 25 |
---|---|---|---|---|---|---|
No. of Plants | 4 | 2 | 7 | 1 | 3 | 1 |
Solution 4.5:
x | f | 1/x | f(1/x) |
---|---|---|---|
20 | 4 | 0.05 | 0.2 |
21 | 2 | 0.047619 | 0.095238 |
22 | 7 | 0.045455 | 0.318182 |
23 | 1 | 0.043478 | 0.043478 |
24 | 3 | 0.041667 | 0.125 |
25 | 1 | 0.04 | 0.04 |
Total | 18 | | 0.821898 |
Here n =18
\[\mathbf{\text{H.M}} = \frac{n}{\sum_{i = 1}^{k}{f_{i}\frac{1}{x_{i}}}} = \frac{18}{0.821898} = 21.90\]
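For a frequency distribution the only change is that each reciprocal is weighted by its frequency. A minimal sketch, with the hypothetical function name `grouped_harmonic_mean`, applied to the tomato data of Example 4.5:

```python
def grouped_harmonic_mean(values, freqs):
    """H.M = n / sum(f_i / x_i), with n = sum(f_i)."""
    n = sum(freqs)
    weighted_reciprocal_sum = sum(f / x for x, f in zip(values, freqs))
    return n / weighted_reciprocal_sum


tomatoes_per_plant = [20, 21, 22, 23, 24, 25]
number_of_plants = [4, 2, 7, 1, 3, 1]

hm_tomato = grouped_harmonic_mean(tomatoes_per_plant, number_of_plants)
print(round(hm_tomato, 2))
```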
4.2.2 Merits and Demerits of Harmonic mean
Merits
It is rigidly defined.
It is defined on all observations.
It is amenable to further algebraic treatment.
It is the most suitable average when it is desired to give greater weight to the smaller values and less weight to the larger ones.
Demerits
It is not easily understood.
It is difficult to compute.
It is only a summary figure and may not be the actual item in the series.
It gives greater importance to small items and is therefore useful only when small items have to be given greater weightage.
It is rarely used in grouped data.
4.3 Relation between AM, GM and HM
If AM stands for Arithmetic Mean, GM for Geometric Mean and HM for Harmonic Mean, then for two positive observations
\[\mathbf{\text{AM}}\mathbf{\times}\mathbf{\text{HM}}\mathbf{=}\mathbf{\text{GM}}^{\mathbf{2}}\]
and, for any set of positive observations,
\[\mathbf{AM \geq GM \geq HM}\]
with equality only when all the observations are equal.
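These relations can be checked numerically. The sketch below is only an illustration with hypothetical helper names. It verifies the ordering on the data of Example 4.4 and checks the product identity on a pair of values (4 and 9, chosen here purely for illustration), since the identity is exact for two positive observations.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp of the mean natural log; equivalent to the nth root of the product
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    return len(values) / sum(1 / v for v in values)


data = [5, 10, 17, 24, 30]
am, gm, hm = arithmetic_mean(data), geometric_mean(data), harmonic_mean(data)
print(am >= gm >= hm)  # ordering AM >= GM >= HM

pair = [4, 9]  # illustrative pair of positive values
am2, gm2, hm2 = arithmetic_mean(pair), geometric_mean(pair), harmonic_mean(pair)
print(math.isclose(am2 * hm2, gm2 ** 2))  # AM x HM = GM^2 for two values
```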
4.4 When to use AM, GM and HM?
A practical answer is that it depends on what your numbers are measuring.
If you are measuring quantities that add up linearly, such as lengths, distances or weights, then an arithmetic mean will give you a meaningful average. For example, the arithmetic mean of the heights or weights of students in a class represents the average height or weight of a student in the class.
Harmonic mean will give you a meaningful average if you are measuring quantities that combine through their reciprocals, such as speeds over equal distances, capacitances in series or resistances in parallel. For example, the harmonic mean of n capacitors connected in series is the capacitance that each of n identical capacitors would need in order to give the same total capacitance as the original set.
If you are measuring quantities that multiply in a sequence, such as growth rates or percentages, then a geometric mean will give you a meaningful average. For example, the geometric mean of the annual growth factors (1 + interest rate) over ten years is the constant annual growth factor that, if applied for ten years, would produce the same amount of growth in principal as the sequence of different annual interest rates did.
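Two small numeric illustrations may help here. All numbers below (the speeds, leg length, rates and principal) are hypothetical and chosen only for the sketch. The first part shows that the average speed over legs of equal distance is the harmonic mean of the speeds. The second shows that the geometric mean of growth factors reproduces the final amount.

```python
import math

# --- Harmonic mean: average speed over two legs of equal length ---
speeds_kmh = [40.0, 60.0]   # hypothetical speeds
leg_km = 120.0              # hypothetical length of each leg
total_time_h = sum(leg_km / s for s in speeds_kmh)
average_speed = len(speeds_kmh) * leg_km / total_time_h
hm_speed = len(speeds_kmh) / sum(1 / s for s in speeds_kmh)
print(average_speed, hm_speed)  # both are total distance / total time

# --- Geometric mean: constant growth factor equivalent to varying rates ---
annual_rates = [0.10, 0.20, -0.05]           # hypothetical yearly rates
factors = [1 + r for r in annual_rates]      # growth factors multiply
gm_factor = math.prod(factors) ** (1 / len(factors))
principal = 1000.0                           # hypothetical starting amount
final_actual = principal * math.prod(factors)
final_constant = principal * gm_factor ** len(factors)
print(final_actual, final_constant)          # same final amount
```

Note that the geometric mean is taken over the factors (1 + rate), not over the rates themselves, which is what allows negative rates to be handled.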
4.5 Positional Averages
A positional average is an average that is picked out of the series itself, on the basis of its position or frequency in the ordered data, to represent the whole series.
For the median, the middle-most value of the series is taken as the representative value, so the median is a positional average. The mode is also a positional average, since modal values are the most frequently occurring values and are taken directly from the series itself. Other positional averages include percentiles, quartiles and deciles.
Note that the arithmetic mean, harmonic mean and geometric mean are termed mathematical averages.
4.5.1 Quartiles
The median divides a set of data into two equal parts. We can also divide a set of data into more than two parts. When an ordered set of data is divided into four equal parts, the division points are called quartiles.
The first or lower quartile (\(\mathbf{Q}_{\mathbf{1}}\)) is a value that has one fourth, or 25% of the observations below its value.
The second quartile (\(\mathbf{Q}_{\mathbf{2}}\)), has one-half, or 50% of the observations below its value. The second quartile is equal to the median.
The third or upper quartile, (\(\mathbf{Q}_{\mathbf{3}}\)), is a value that has three-fourths, or 75% of the observations below it.
\(\mathbf{Q}_{\mathbf{1}}\mathbf{=}\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\) item
\(\mathbf{Q}_{\mathbf{3}}\mathbf{=}\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\) item
The calculation of quartiles is explained using the example below, which also shows the procedure to follow when a fraction appears in the calculation.
Example 4.6: Compute quartiles for the data 25, 18, 30, 8, 15, 5, 10, 35, 40, 45
Solution 4.6:
First arrange the data in ascending order
5, 8, 10, 15, 18, 25, 30, 35, 40, 45
here n = 10
\(\mathbf{Q}_{\mathbf{1}}\mathbf{=}\left(\frac{\mathbf{n + 1}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\)item
i.e. \(Q_{1} = \left( \frac{10 + 1}{4} \right)^{th}\) item = 2.75th item; when such a fraction appears we use the following procedure:
\(Q_{1}\) = 2.75th item = 2nd item + 0.75(3rd item – 2nd item)
So from the given data \(Q_{1}\) = 8 + 0.75(10 – 8) = 9.5
\[\mathbf{Q}_{\mathbf{2}}\mathbf{= median}\]
here \(Q_{2} =\)(18+25)/2 = 21.5
\(\mathbf{Q}_{\mathbf{3}}\mathbf{=}\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)^{\mathbf{\text{th}}}\)item
i.e. \(Q_{3} = \left( 3 \times \frac{(10 + 1)}{4} \right)^{th}\) item = 8.25th item = 8th item + 0.25(9th item – 8th item) = 35 + 0.25(40 – 35) = 36.25
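The fractional-position procedure of Solution 4.6 can be sketched as follows. The helper names `item_at` and `quartiles` are hypothetical, and clamping positions that fall outside the data to the first or last value is an assumption the text does not address.

```python
def item_at(sorted_values, position):
    """Value at a 1-based, possibly fractional, position.

    E.g. the 2.75th item = 2nd item + 0.75 * (3rd item - 2nd item).
    """
    whole = int(position)
    fraction = position - whole
    # Assumption: positions beyond either end are clamped to the end value.
    if whole < 1:
        return sorted_values[0]
    if whole >= len(sorted_values):
        return sorted_values[-1]
    lower = sorted_values[whole - 1]
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


def quartiles(data):
    """Q1, Q2, Q3 as the k(n + 1)/4-th items of the ordered data."""
    ordered = sorted(data)
    n = len(ordered)
    return tuple(item_at(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


q1, q2, q3 = quartiles([25, 18, 30, 8, 15, 5, 10, 35, 40, 45])
print(q1, q2, q3)  # 9.5 21.5 36.25
```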
4.5.1.1 Quartiles of a discrete frequency data
Find the cumulative frequencies.
Find \(\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)\).
In the cumulative frequencies, locate the value just greater than \(\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)\); the corresponding value of \(x\) is \(Q_{1}\).
Find \(\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)\).
In the cumulative frequencies, locate the value just greater than \(\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)\); the corresponding value of \(x\) is \(Q_{3}\).
Example 4.7: Compute quartiles for the data given below
\(\mathbf{x}\) | 5 | 8 | 12 | 15 | 19 | 24 | 30 |
---|---|---|---|---|---|---|---|
\(\mathbf{f}\) | 4 | 3 | 2 | 4 | 5 | 2 | 4 |
Solution 4.7:
x | f | cf |
---|---|---|
5 | 4 | 4 |
8 | 3 | 7 |
12 | 2 | 9 |
15 | 4 | 13 |
19 | 5 | 18 |
24 | 2 | 20 |
30 | 4 | 24 |
Here n =24
\(\left( \frac{\mathbf{n + 1}}{\mathbf{4}} \right)\) = \(\left( \frac{\mathbf{25}}{\mathbf{4}} \right)\) = 6.25
The cumulative frequency value just greater than 6.25 is 7, and the \(\mathbf{x}\) value corresponding to cumulative frequency 7 is 8. So \(\mathbf{Q}_{\mathbf{1}}\) = 8
\(\left( \frac{\mathbf{3(n + 1)}}{\mathbf{4}} \right)\) = \(\left( \frac{\mathbf{3}\mathbf{\times}\mathbf{25}}{\mathbf{4}} \right)\) = 18.75
The cumulative frequency value just greater than 18.75 is 20, and the \(\mathbf{x}\) value corresponding to cumulative frequency 20 is 24. So \(\mathbf{Q}_{\mathbf{3}}\) = 24
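The cumulative-frequency lookup used in Solution 4.7 can be sketched as below. The function name `discrete_quartile` is hypothetical. One assumption goes beyond the text: a cumulative frequency exactly equal to the target position is treated as a match, whereas the text only speaks of the value "just greater than" it.

```python
from itertools import accumulate


def discrete_quartile(values, freqs, k):
    """k-th quartile (k = 1, 2, 3) of a discrete frequency distribution."""
    n = sum(freqs)
    position = k * (n + 1) / 4
    # Walk through the cumulative frequencies until one reaches the position.
    for x, cf in zip(values, accumulate(freqs)):
        if cf >= position:  # assumption: an exact tie counts as a match
            return x
    return values[-1]


x_values = [5, 8, 12, 15, 19, 24, 30]
frequencies = [4, 3, 2, 4, 5, 2, 4]

q1_discrete = discrete_quartile(x_values, frequencies, 1)
q3_discrete = discrete_quartile(x_values, frequencies, 3)
print(q1_discrete, q3_discrete)  # 8 24
```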
4.5.1.2 Quartiles of a continuous frequency data
Find the cumulative frequencies.
Find \(\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\).
In the cumulative frequencies, locate the value just greater than \(\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\); the corresponding class interval is called the first quartile class.
Find \(3\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\).
In the cumulative frequencies, locate the value just greater than \(3\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\); the corresponding class interval is called the 3rd quartile class.
Then apply the respective formulae:
\[\mathbf{Q}_{\mathbf{1}}\mathbf{=}\mathbf{l}_{\mathbf{1}}\mathbf{+}\frac{\frac{\mathbf{n}}{\mathbf{4}}\mathbf{-}\mathbf{m}_{\mathbf{1}}}{\mathbf{f}_{\mathbf{1}}}\mathbf{\times}\mathbf{c}_{\mathbf{1}}\]
\[\mathbf{Q}_{\mathbf{3}}\mathbf{=}\mathbf{l}_{\mathbf{3}}\mathbf{+}\frac{\mathbf{3}\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\mathbf{-}\mathbf{m}_{\mathbf{3}}}{\mathbf{f}_{\mathbf{3}}}\mathbf{\times}\mathbf{c}_{\mathbf{3}}\]
Where, \(l_{1}\) = lower limit of the first quartile class
\(f_{1}\) = frequency of the first quartile class
\(c_{1}\) = width of the first quartile class
\(m_{1}\) = cumulative frequency preceding the first quartile class
\(l_{3}\) = lower limit of the 3rd quartile class
\(f_{3}\)= frequency of the 3rd quartile class
\(c_{3}\)= width of the 3rd quartile class
\(m_{3}\) = cumulative frequency preceding the 3rd quartile class
Example 4.8: Find the quartiles for the grouped frequency data given below
Class | frequency | cumulative frequency |
---|---|---|
0-10 | 11 | 11 |
10-20 | 18 | 29 |
20-30 | 25 | 54 |
30-40 | 28 | 82 |
40-50 | 30 | 112 |
50-60 | 33 | 145 |
60-70 | 22 | 167 |
70-80 | 15 | 182 |
80-90 | 12 | 194 |
90-100 | 10 | 204 |
Solution 4.8:
\(\left( \frac{n}{4} \right)\) = \(\frac{204}{4}\) = 51
The cumulative frequency value just greater than 51 is 54 so the class 20-30 is the 1st quartile class
\[\mathbf{Q}_{\mathbf{1}}\mathbf{=}\mathbf{l}_{\mathbf{1}}\mathbf{+}\frac{\frac{\mathbf{n}}{\mathbf{4}}\mathbf{-}\mathbf{m}_{\mathbf{1}}}{\mathbf{f}_{\mathbf{1}}}\mathbf{\times}\mathbf{c}_{\mathbf{1}}\]
\[\mathbf{= 20 +}\frac{\mathbf{51 - 29}}{\mathbf{25}}\mathbf{\times 10\ = 28.8}\]
\(3\left( \frac{n}{4} \right)\)= \(3 \times \frac{204}{4}\) = 153
The cumulative frequency value just greater than 153 is 167 so the class 60-70 is the 3rd quartile class
\[\mathbf{Q}_{\mathbf{3}}\mathbf{=}\mathbf{l}_{\mathbf{3}}\mathbf{+}\frac{\mathbf{3}\left( \frac{\mathbf{n}}{\mathbf{4}} \right)\mathbf{-}\mathbf{m}_{\mathbf{3}}}{\mathbf{f}_{\mathbf{3}}}\mathbf{\times}\mathbf{c}_{\mathbf{3}}\]
\[\mathbf{= 60 +}\frac{\mathbf{153 - 145}}{\mathbf{22}}\mathbf{\times 10 = 63.64}\]
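The interpolation formula applied in Solution 4.8 can be sketched as follows. The function name `grouped_quartile` is hypothetical, and the sketch assumes all classes share one width, as they do in this example.

```python
def grouped_quartile(lower_limits, width, freqs, k):
    """Q_k = l + (k*n/4 - m) / f * c for a grouped frequency table.

    l: lower limit of the quartile class, m: cumulative frequency
    preceding it, f: its frequency, c: the (common) class width.
    """
    n = sum(freqs)
    target = k * n / 4
    preceding_cf = 0
    for lower, f in zip(lower_limits, freqs):
        # Quartile class: first class whose cumulative frequency
        # is just greater than the target.
        if preceding_cf + f > target:
            return lower + (target - preceding_cf) / f * width
        preceding_cf += f
    return lower_limits[-1] + width  # target at or beyond the last class


lower_limits = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
class_freqs = [11, 18, 25, 28, 30, 33, 22, 15, 12, 10]

q1_grouped = grouped_quartile(lower_limits, 10, class_freqs, 1)
q3_grouped = grouped_quartile(lower_limits, 10, class_freqs, 3)
print(round(q1_grouped, 2), round(q3_grouped, 2))
```

The first quartile comes out as 28.8 and the third as roughly 63.6, matching the hand calculation.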
4.5.2 Percentiles
The percentile values divide an ordered set of data into 100 equal parts, each containing 1 percent of the observations. The xth percentile, denoted as \(P_{x}\), is that value below which x percent of the values in the distribution fall. It may be noted that the median is the 50th percentile, the 25th percentile is the first quartile \(Q_{1}\) and the 75th percentile is the third quartile \(Q_{3}\).
For raw data, first arrange the n observations in increasing order. Then the xth percentile is given by
\(\mathbf{P}_{\mathbf{x}}\mathbf{=}\left( \frac{\mathbf{x}\left( \mathbf{n + 1} \right)}{\mathbf{100}} \right)^{\mathbf{\text{th}}}\) item
For a frequency distribution the xth percentile is given by following steps
Find cumulative frequencies
Find \(\left( \frac{x \times n}{100} \right)\)
In the cumulative frequencies, locate the value just greater than \(\left( \frac{x \times n}{100} \right)\); the corresponding class interval is called the percentile class.
Use the following formula
\[\mathbf{P}_{\mathbf{x}}\mathbf{= l +}\frac{\left( \frac{\mathbf{x \times n}}{\mathbf{100}} \right)\mathbf{- cf}}{\mathbf{f}}\mathbf{\times c}\]
Where
\(\mathbf{l}\) = lower limit of the percentile class
\(\mathbf{\text{cf}}\) = cumulative frequency preceding the percentile class
\(\mathbf{f}\) = frequency of the percentile class
\(\mathbf{c}\) = width of the percentile class
\(\mathbf{n}\) = total number of observations
Example 4.9: Compute \(\mathbf{P}_{\mathbf{25}}\)and \(\mathbf{P}_{\mathbf{75}}\) for the data 25, 18, 30, 8, 15, 5, 10, 35, 40, 45
Solution 4.9:
First arrange the data in ascending order
5, 8, 10, 15, 18, 25, 30, 35, 40, 45
Here n =10
\(\mathbf{P}_{\mathbf{25}}\mathbf{=}\left( \frac{\mathbf{25}\left( \mathbf{10 + 1} \right)}{\mathbf{100}} \right)^{\mathbf{\text{th}}}\) item = 2.75th item
\(P_{25}\) = 2.75th item = 2nd item + 0.75(3rd item – 2nd item)
So from the given data \(P_{25}\) = 8 + 0.75(10 – 8) = 9.5
\(\mathbf{P}_{\mathbf{75}}\mathbf{=}\left( \frac{\mathbf{75}\left( \mathbf{10 + 1} \right)}{\mathbf{100}} \right)^{\mathbf{\text{th}}}\) item = 8.25th item
\(P_{75}\) = 8.25th item = 8th item + 0.25(9th item – 8th item) = 35 + 0.25(40 – 35) = 36.25
Note: The data in this example are the same as in Example 4.6; it can be seen that \(P_{25} = Q_{1}\) and \(P_{75} = Q_{3}\), as is always the case.
4.5.3 Deciles
Deciles are similar to quartiles. But while quartiles are three points that divide an ordered set of data into four quarters, deciles are 9 points that divide an ordered set of data into ten equal parts. The xth decile is denoted as \(\text{d}_{x}\). It may be noted that the median is the 5th decile.
\(\mathbf{d}_{\mathbf{x}}\mathbf{=}\left( \frac{\mathbf{x}\left( \mathbf{n + 1} \right)}{\mathbf{10}} \right)^{\mathbf{\text{th}}}\) item
For a frequency distribution the xth decile is given by following steps
Find cumulative frequencies
Find \(\left( \frac{x \times n}{10} \right)\)
In the cumulative frequencies, locate the value just greater than \(\left( \frac{x \times n}{10} \right)\); the corresponding class interval is called the decile class.
Use the following formula
\[\mathbf{d}_{\mathbf{x}}\mathbf{= l +}\frac{\left( \frac{\mathbf{x \times n}}{\mathbf{10}} \right)\mathbf{- cf}}{\mathbf{f}}\mathbf{\times c}\]
Where
\(\mathbf{l}\) = lower limit of the decile class
\(\mathbf{\text{cf}}\) = cumulative frequency preceding the decile class
\(\mathbf{f}\) = frequency of the decile class
\(\mathbf{c}\) = width of the decile class
\(\mathbf{n}\) = total number of observations
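The decile and percentile formulae for a frequency distribution differ only in the divisor (10 or 100), so one function can cover both. The sketch below is an illustration under assumptions: the name `grouped_partition_value` and its `parts` parameter are hypothetical, and since the text works no decile example, the grouped table of Example 4.8 is reused here purely for demonstration.

```python
def grouped_partition_value(lower_limits, widths, freqs, x, parts):
    """x-th partition value: l + (x*n/parts - cf) / f * c.

    parts = 10 gives deciles, parts = 100 gives percentiles.
    """
    n = sum(freqs)
    target = x * n / parts
    cf = 0  # cumulative frequency preceding the current class
    for lower, width, f in zip(lower_limits, widths, freqs):
        # The decile/percentile class is the first one whose cumulative
        # frequency is just greater than the target.
        if cf + f > target:
            return lower + (target - cf) / f * width
        cf += f
    return lower_limits[-1] + widths[-1]  # target at or beyond the last class


# Grouped table reused from Example 4.8 (classes 0-10, ..., 90-100)
lower_limits = list(range(0, 100, 10))
widths = [10] * 10
class_freqs = [11, 18, 25, 28, 30, 33, 22, 15, 12, 10]

d5 = grouped_partition_value(lower_limits, widths, class_freqs, 5, 10)
p25 = grouped_partition_value(lower_limits, widths, class_freqs, 25, 100)
print(round(d5, 2), round(p25, 2))
```

Here the 5th decile is the grouped median, and the 25th percentile reproduces the first quartile of Example 4.8 (28.8).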