**Mean vs Median vs Mode**

Mean, median, and mode are the primary *measures of central tendency* used in descriptive statistics. Each is defined differently, and each suits a different kind of data when a summary is needed.

**Mean**

The arithmetic mean is the sum of the data values divided by the number of data values, i.e.

`[latex]\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_{i} = \frac{x_{1}+x_{2}+x_{3}+\cdots+x_{n}}{n}[/latex]`

If the data come from a sample, the result is called the sample mean ([latex]\bar{x}[/latex]), which is a descriptive statistic of that sample. Although it is the most commonly used descriptive measure for a sample, it is not a robust statistic: it is very sensitive to outliers and oscillations.

For example, consider the average income of the citizens of a particular city. Since all the data values are summed and then divided by their count, the income of a single extremely wealthy person shifts the mean significantly. Therefore, the mean is not always a good representation of the data.
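The effect of one extreme income can be sketched with Python's standard `statistics` module. The income figures below are made up purely for illustration and are not taken from any real city.

```python
from statistics import mean, median

# Hypothetical annual incomes (in thousands) for nine residents.
incomes = [32, 35, 38, 40, 41, 43, 45, 47, 50]

# The same city after one extremely wealthy resident is added.
incomes_with_outlier = incomes + [5000]

mean_before = mean(incomes)
mean_after = mean(incomes_with_outlier)
median_before = median(incomes)
median_after = median(incomes_with_outlier)

print(round(mean_before, 2), mean_after)   # 41.22 537.1
print(median_before, median_after)         # 41 42.0
```

A single added value moves the mean by more than a factor of ten, while the median barely changes.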

Also, in the case of an alternating signal, the current passing through an element varies periodically from the positive direction to the negative direction and back. If we take the average current through the element over a single period, the result is 0, which suggests that no current has passed through the element. That is obviously not true. Therefore, in this case too, the arithmetic mean is not a good measure.
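A minimal numerical sketch of this point, assuming a sinusoidal current with a made-up amplitude and sampling rate. The root mean square (RMS) value is shown for comparison as one common alternative; it is an addition here rather than something the text above prescribes.

```python
import math

n = 1000          # samples per period (assumed)
amplitude = 2.0   # peak current in amperes (assumed)

# One full period of a sinusoidal current, sampled at n evenly spaced points.
samples = [amplitude * math.sin(2 * math.pi * k / n) for k in range(n)]

# Arithmetic mean over the period: positive and negative halves cancel.
average = sum(samples) / n

# Root mean square: squaring removes the sign before averaging.
rms = math.sqrt(sum(s * s for s in samples) / n)

print(abs(average) < 1e-9)   # True
print(round(rms, 4))         # 1.4142
```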

The arithmetic mean is a good indicator when the data are distributed symmetrically and contain no extreme values. For a normal distribution, the mean is equal to the mode and the median. The mean is also the single value that minimizes the sum of squared deviations from the data points (and hence the root mean squared error); therefore, it is the best descriptive measure, in that sense, when a dataset must be represented by a single number.
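The least-squares property can be checked numerically. The sketch below uses a made-up data set and a hypothetical helper, `sum_squared_error`, and compares the mean against a grid of nearby candidate values. It is an illustration, not a proof.

```python
from statistics import mean

data = [2, 4, 4, 5, 7, 9, 11]  # made-up values; their mean is 6

def sum_squared_error(values, c):
    """Sum of squared deviations of the values from a candidate centre c."""
    return sum((v - c) ** 2 for v in values)

center = mean(data)

# Candidate centres from 3.0 to 9.0 in steps of 0.1.
candidates = [center + step / 10 for step in range(-30, 31)]
best = min(candidates, key=lambda c: sum_squared_error(data, c))

print(best)                             # 6.0
print(sum_squared_error(data, center))  # 60
```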

**Median**

The median of a dataset is the value of the middle data point after all the data values have been arranged in ascending order. The median is the 2nd quartile, 5th decile and 50th percentile.

*• If the number of observations (data points) is odd, then the median is the observation exactly in the middle of the ordered list.*

*• If the number of observations (data points) is even, then the median is the mean of the two middle observations in the ordered list.*
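The two rules above can be sketched as a small function. The name `median_of` is a hypothetical helper chosen for this example; in practice Python's built-in `statistics.median` does the same job.

```python
def median_of(values):
    """Return the median using the odd/even rules described above."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the observation exactly in the middle.
        return ordered[mid]
    # Even count: the mean of the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([7, 1, 3]))     # 3
print(median_of([4, 1, 3, 2]))  # 2.5
```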

The median divides the observations into two groups: half of the values lie at or below the median and half lie at or above it. The median is particularly useful for skewed distributions, where it usually represents the data better than the arithmetic mean.

**Mode**

The mode is the most frequently occurring value in a set of observations. The mode of a data set is found by counting the frequency of each element within the set.

*• If no value occurs more than once, then the data set has no mode.*

*• Otherwise, any value that occurs with the greatest frequency is a mode of the data set.*
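A minimal sketch of these two rules using `collections.Counter`. The function name `modes` is hypothetical. Note that it follows the convention stated above (no repeated value means no mode), whereas Python's own `statistics.multimode` returns every value in that situation.

```python
from collections import Counter

def modes(values):
    """Return all modes, or an empty list if no value occurs more than once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value is repeated, so by the convention above there is no mode.
        return []
    # Every value that reaches the greatest frequency is a mode.
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 3]))        # []
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
print(modes([5, 5, 9, 9, 9]))  # [9]
```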

More than one mode can exist in a set; therefore, the mode is not a unique statistic of a dataset. A unimodal distribution has exactly one mode, whereas in a uniform distribution every value is equally likely, so no single value stands out as the mode. The mode of a discrete probability distribution is the point where the probability mass function reaches its highest value. In other words, the *global maxima* of the distribution are its modes.

Consider the application of all three measures to the following data set.

*DATA: {1, 1, 2, 3, 5, 5, 5, 5, 6, 6, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 14, 14, 15, 15, 15}*

*Mean = (1 + 1 + 2 + 3 + 5 + 5 + 5 + 5 + 6 + 6 + 8 + 8 + 9 + 9 + 9 + 9 + 9 + 10 + 10 + 10 + 14 + 14 + 15 + 15 + 15) / 25 = 203 / 25 = 8.12*

*Median = 9 (13th element)*

*Mode = 9 (frequency of 9 = 5)*
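These results can be checked with Python's standard `statistics` module (`multimode` requires Python 3.8 or later). This is only a verification sketch of the numbers above.

```python
from statistics import mean, median, multimode

data = [1, 1, 2, 3, 5, 5, 5, 5, 6, 6, 8, 8, 9, 9, 9, 9, 9,
        10, 10, 10, 14, 14, 15, 15, 15]

print(len(data))        # 25
print(sum(data))        # 203
print(mean(data))       # 8.12
print(median(data))     # 9  (the 13th of 25 ordered values)
print(multimode(data))  # [9]
print(data.count(9))    # 5
```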

**What is the difference between Mean, Median and Mode?**

• The arithmetic mean is the sum of the values (observations) divided by the number of observations. It is not a robust statistic, and how well it represents the data depends heavily on how close the distribution is to a symmetric, normal-like shape. A single outlier may cause a significant shift in the mean, giving a misleading value. The concept can be extended to the geometric mean, harmonic mean, weighted mean and so on.

• The median is the middle value of the ordered set of observations, and it is much less affected by outliers. It can give a good summary statistic in highly skewed cases.

• The mode is the most common value in the dataset. If the distribution is positively skewed, the mode typically lies to the left of the median; if it is negatively skewed, the mode typically lies to the right of the median.

• If the distribution is positively skewed, the mean typically lies to the right of the median; if it is negatively skewed, the mean typically lies to the left of the median.

• In a normal distribution, the mean, median and mode are all equal.
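The ordering of the three measures under skew can be illustrated with two small made-up samples, one positively skewed and its mirror image. This is a sketch of the usual pattern for unimodal data, not a rule that holds for every possible data set.

```python
from statistics import mean, median, mode

# Made-up positively skewed sample: a long tail toward large values.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Its mirror image, which is negatively skewed.
left_skewed = [15 - x for x in right_skewed]

print(mode(right_skewed), median(right_skewed), mean(right_skewed))
# 2 3.0 4.5   -> mode < median < mean

print(mode(left_skewed), median(left_skewed), mean(left_skewed))
# 13 12.0 10.5   -> mean < median < mode
```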
