Statistics Simplified is a series that explains statistics in layman's terms.
Any data set should be analyzed for its central tendency and variation. Why variation? What benefits will we get by looking at variation?
Consider this scenario: you come to a river that has no bridge, so it must be crossed on foot. You cannot swim, and the current in the river is calm. A board on the river’s bank gives the average depth as 3 feet.
You are 5.8 feet tall.
Will you cross the river?
In our day-to-day lives, we usually look at the average for performance comparison and decision-making.
This is a major flaw in our thinking, because it ignores another critical property of the data: variation.
This way of thinking is called the “Flaw of Averages”.
Had the board given additional details, such as a maximum depth of 8 ft., would you have crossed the river?
Taking the variation in the data into account leads to wiser decisions.
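The river scenario can be sketched in a few lines of Python. The depth readings below are invented for illustration (they are not from any real survey); they were chosen only so that the average comes to 3 feet and the deepest point to 8 feet, matching the numbers in the story. The variable names are likewise hypothetical.

```python
# Hypothetical depth readings (in feet) taken at points across the river.
# These values are made up so that the average is 3 ft and the maximum is 8 ft.
depths_ft = [1, 1, 2, 2, 3, 8, 4, 3]
height_ft = 5.8  # height of the person crossing, as given in the scenario

average_depth = sum(depths_ft) / len(depths_ft)
max_depth = max(depths_ft)

# Judging by the average alone, the crossing looks safe.
safe_on_average = average_depth < height_ft
# Judging by the deepest point, it does not.
safe_at_deepest = max_depth < height_ft

print("Average depth:", average_depth)
print("Maximum depth:", max_depth)
print("Looks safe going by the average:", safe_on_average)
print("Safe at the deepest point:", safe_at_deepest)
```

The average hides the one reading that matters most to someone who cannot swim, which is exactly what the Flaw of Averages describes.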
What is variation?
It is a measure of how spread out the data points in a given data set are.
The lower, the better
Low variation implies:
Performance is efficient and better managed.
Outliers in the performance are less likely.
Future values can be predicted more reliably.
Measures of Variation
Popular ways to measure variation are Standard Deviation, Inter-Quartile Range (IQR), and Range.
Range: Difference between the maximum and minimum values.
Standard Deviation: Typical distance of the data points from the mean, calculated as the square root of the average squared distance from the mean.
Inter-Quartile Range (IQR): Difference between the 75th percentile and the 25th percentile, where the pth percentile is the value below which roughly p% of the data points fall when they are arranged in ascending order. The Median is the 50th percentile.
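The three measures above can be computed with Python's standard `statistics` module. The following is a minimal sketch under a few assumptions: the function name `describe_variation` and both data sets are hypothetical, the standard deviation is the population version (`pstdev`), and the quartiles use the "inclusive" method, which is only one of several percentile conventions in common use.

```python
from statistics import mean, pstdev, quantiles


def describe_variation(values):
    """Return the range, standard deviation, and IQR of a list of numbers."""
    # quantiles(..., n=4) returns the three cut points: Q1, the median, and Q3.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),
        "std_dev": pstdev(values),  # population standard deviation
        "iqr": q3 - q1,
    }


# Two made-up data sets with the same average (50) but very different spread.
steady = [48, 49, 50, 51, 52]
erratic = [10, 30, 50, 70, 90]

for name, values in (("steady", steady), ("erratic", erratic)):
    print(name, "mean:", mean(values), describe_variation(values))
```

Both data sets share the same central tendency, yet every measure of variation is far larger for the second one, which is why the average alone cannot tell them apart.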
Also, see Central Tendency