A skewed distribution is one in which the mean deviates from the peak of the distribution. ScienceStruck tells you about the types of skewed distributions, along with some real-life examples, for better understanding.
Did You Know?
The study of skewed distributions can be traced back to the end of the nineteenth century.
In the field of statistics, a data distribution is used to study values belonging to a large population or sample. A population is a large group of items with some similarity between them. Whenever such a distribution is studied, several unique characteristics can be attributed to each distribution. One such characteristic is the symmetry of the distribution. The symmetry shows how the values of the population are arranged around the measures of central tendency, such as the mean, median, or mode.
The measures of central tendency are used to represent an entire group of values. They are:
■ The mean is the arithmetic average of all values in the population or sample.
■ The median is the middle value in the distribution, such that an equal number of values lie to its left and to its right.
■ The mode is the value that occurs most often.
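The three measures above can be computed with Python's standard `statistics` module. This is a minimal sketch; the sample values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
sample = [2, 3, 3, 4, 5, 7, 11]

mean_value = statistics.mean(sample)      # sum of the values divided by their count
median_value = statistics.median(sample)  # middle value of the sorted data
mode_value = statistics.mode(sample)      # value that occurs most often

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

For this sample the mode (3) is smaller than the median (4), which is smaller than the mean (5), so the three measures do not coincide.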
A symmetric distribution is one in which the mean, median, and mode coincide with each other, and the two halves of the distribution are mirror images of each other. Practically, it is difficult to encounter a symmetric distribution. The distributions most commonly observed are asymmetric or skewed distributions.
An asymmetric distribution is one in which the mean does not coincide with the peak of the distribution, and one of the ‘tails’ of the distribution is longer than the other. The different types of skewed distribution along with some real-life examples are given in the upcoming sections.
Types of Skewed Distributions
Positively Skewed Distribution
A positively skewed distribution is one in which the tail of the distribution extends towards the right, i.e., it has a tail in the positive direction of the curve. For this reason, it is also called a right skewed distribution. More accurately, a distribution is said to be right skewed if its right tail is longer than its left tail. In this distribution, the mean lies to the right of the peak. The reason for this skewness is that the mass of the distribution is concentrated on the left side of the curve, which means that most values of the distribution occur on the left side. Positively skewed distributions are more common than negatively skewed ones.
In a positively skewed distribution, the extreme scores occur on the right side and have a higher magnitude. As a rule, the mean shifts towards the extreme scores. Since the extreme scores are larger in a right skewed distribution, the mean has a higher value. In fact, in a positively skewed distribution, both the mean and median are typically greater than the mode, and the mean is usually greater than the median as well. One way of deciding whether a distribution is positively or negatively skewed is by the following formula:
Pearson’s Coefficient of Skewness = (Mean – Mode) ÷ Standard deviation
The standard deviation measures how far the values of the distribution typically lie from the mean, and it is positive whenever the values are not all identical. By this formula, it is clear that the value of Pearson’s Coefficient will be positive for a right skewed distribution, since the mean of such a distribution is greater than its mode. This is one more reason why a right skewed distribution is called a positively skewed distribution.
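The formula above can be sketched as a small Python function. The function name `pearson_mode_skewness` and the sample data are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form is intended.

```python
import statistics


def pearson_mode_skewness(values):
    """Return (mean - mode) / standard deviation for a list of numbers."""
    mean_value = statistics.mean(values)
    # If several values tie for most frequent, statistics.mode returns
    # the first one encountered (Python 3.8 and later).
    mode_value = statistics.mode(values)
    # Assumption: population standard deviation; the sample form
    # (statistics.stdev) would change the size but not the sign.
    spread = statistics.pstdev(values)
    if spread == 0:
        raise ValueError("all values are identical, so the coefficient is undefined")
    return (mean_value - mode_value) / spread


# Hypothetical right skewed data: many small values and one large one.
right_skewed = [1, 2, 2, 2, 3, 4, 9]
coefficient = pearson_mode_skewness(right_skewed)
print("coefficient is positive:", coefficient > 0)
```

Only the sign of the result is needed to tell the direction of the skew; the mode-based coefficient can be unstable for small samples or for data with several equally frequent values.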
Distribution of Income
Suppose the household incomes of a region are studied, with most values ranging from $5,000 to $250,000. Most of the citizens fall in the group between $5,000 and $100,000, which forms the bulk of the distribution towards its left, or lower, side. However, a few individuals may have a very high income, running into millions. This makes the tail of extreme values (high income) extend further towards the positive, or right, side. Thus, it is a positively skewed distribution.
If a test conducted in a school has a high difficulty level, then most of the students will have a poor-to-average performance in it. This bulk of students will form the largest part of the distribution, towards the left side of the positively skewed distribution curve. The highest marks in the test will be obtained by only a few meritorious students, who form the right tail of extreme values. The students with very high marks will shift the mean towards the right, making it a positively skewed distribution. In other words, there will be a higher frequency of low scores and a lower frequency of high scores.
Neighborhood Housing Prices
Housing prices typically follow a positively skewed distribution. For example, if a neighborhood has 100 houses, with 99 of them priced at $100,000 and only one selling at $1,000,000, then the peak of the distribution lies at $100,000, on the left side, since it is the lower value. However, the single house priced at $1,000,000 will push the mean higher, and result in a long tail towards the right side, making it a positively skewed distribution.
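The housing example can be checked numerically. The sketch below uses the prices stated in the example; the variable names are illustrative.

```python
import statistics

# Prices from the example: 99 houses at $100,000 and one at $1,000,000.
prices = [100_000] * 99 + [1_000_000]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
mode_price = statistics.mode(prices)

# Count how many houses are priced below the mean.
below_mean = sum(price < mean_price for price in prices)

print(f"mean   = ${mean_price:,.0f}")
print(f"median = ${median_price:,.0f}")
print(f"mode   = ${mode_price:,.0f}")
print("houses priced below the mean:", below_mean)
```

The single expensive house lifts the mean to $109,000, while the median and mode stay at $100,000, so 99 of the 100 houses are priced below the mean.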
Negatively Skewed Distribution
A negatively skewed distribution is one in which the tail of the distribution extends towards the left side, i.e., towards the negative side of the peak. It is also called a left skewed distribution. In this case, the tail on the left side is longer than the right tail. The mean in this situation lies to the left of the peak. A left skewed distribution occurs because the mass of the distribution is concentrated on the right, which means that most of the values occur on the right side of the negatively skewed distribution curve.
In such a distribution, the left tail is the part where the extreme values occur, and these values are smaller in magnitude. Since the mean tends to shift towards the extreme values, it is pulled lower. Both the mean and median are typically lower than the mode, and in most such cases, the mean will also be less than the median.
For a left skewed distribution, Pearson’s Coefficient will be negative, because the mean of such a distribution is lower than its mode. This is why such a distribution is called a negatively skewed distribution.
In contrast to the example of a difficult test given above, if a school test is easy, then most of the students will perform well in it. This bulk of students will take up the right side of the negatively skewed distribution curve. However, a few students may perform poorly, and even get very low marks in the test. These extreme values of low magnitude (low marks) extend the tail in the negative, or left, direction, making it a negatively skewed distribution. Here, there is a high frequency of high scores and a low frequency of low scores.
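The easy-test case can be illustrated with a short sketch. The marks below are hypothetical, and Pearson’s Coefficient is computed here with the population standard deviation, which is an assumption.

```python
import statistics

# Hypothetical marks out of 100 on an easy test: most students score
# high, and one student scores very low.
scores = [40, 80, 85, 90, 90, 90, 95]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# (mean - mode) / standard deviation, using the population form (assumption).
coefficient = (mean_score - mode_score) / statistics.pstdev(scores)

print("mean is below the median:", mean_score < median_score)
print("coefficient is negative:", coefficient < 0)
```

The single low mark pulls the mean below both the median and the mode (both 90 here), so the coefficient comes out negative.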
When the retirement ages of employees are compared, it is found that most retire in their mid-sixties, or older. Thus, most people will lie near the higher extreme, or the right side, of the distribution. However, there is a growing trend in which a few people retire early, some of them at very young ages. This will make the tail of the distribution longer towards the left, or lower, side, and the low values (low ages) will shift the mean towards the left, making it a negatively skewed distribution.
When human lifespans are compared, most people live beyond middle age, and many live much longer. Thus, the maximum frequency is of such people, who take up the right side of the distribution, which is the side with higher age values. However, some individuals die younger, and some even at a very young age. These individuals take up the lowest values, i.e., towards the left or negative side of the distribution, making the tail longer on this side.
To sum it up, a positively skewed distribution is one in which there are many values of a low magnitude and a few values of extremely high magnitude, while a negatively skewed distribution is one in which there are many values of a high magnitude with a few values of very low magnitude.