In statistics, the mean of a dataset is the mean value. It’s valuable to know due to the fact that it provides us one idea of where the “center” of the dataset is located. The is calculated utilizing the an easy formula:
mean = (sum that observations) / (number of observations)
For example, suppose we have the adhering to dataset:
<1, 4, 5, 6, 7>
The typical of the dataset is (1+4+5+6+7) / (5) = 4.6
But if the typical is a useful and also easy come calculate, that does have actually one drawback: It have the right to be impacted by outliers. In particular, the smaller sized the dataset, the an ext that an outlier could influence the mean.
You are watching: What effect does the outlier have on the mean
To show this, consider the following standard example:
Ten males are sit in a bar. The average income of the ten males is $50,000. Suddenly one male walks out and Bill gates walks in. Currently the average income of the ten guys in the bar is $40 million.
This example shows how one outlier (Bill Gates) can drastically influence the mean.
Small & huge Outliers
An outlier can affect the average by being unusually little or unusually large. In the previous example, Bill gateways had an unusually large income, which brought about the median to it is in misleading.
However, an unusually tiny value have the right to also affect the mean. To illustrate this, take into consideration the complying with example:
Ten students take an exam and also receive the following scores:
<0, 88, 90, 92, 94, 95, 95, 96, 97, 99>
The median score is 84.6.
However, if we remove the “0” score from the dataset, climate the average score becomes 94.
The one unusually low score that one college student drags the median down for the entire dataset.
Sample dimension & Outliers
The smaller sized the sample dimension of the dataset, the much more an outlier has actually the potential to impact the mean.
For example, mean we have a dataset that 100 test scores where all of the students scored at least a 90 or greater except because that one student who scored a zero:
<0, 90, 90, 92, 94, 95, 95, 96, 97, 99, 94, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99, 93, 90, 90, 92, 94, 95, 95, 96, 97, 99>
The median turns the end to be 93.18. If we gotten rid of the “0” indigenous the dataset, the mean would be 94.12. This is a relatively small difference. This shows that even severe outlier only has a little effect if the dataset is big enough.
How to manage Outliers
If you’re worried the an outlier is present in her dataset, you have actually a few options:Make sure the outlier is no the result of a data entrance error. Sometimes an individual merely enters the not correct data value as soon as recording data. If one outlier is present, first verify the the worth was gone into correctly and that that wasn’t one error.Remove the outlier. If the worth is a true outlier, girlfriend may choose to eliminate it if it will have a far-ranging impact on your all at once analysis. Just make certain to mention in your last report or analysis that you eliminated an outlier.
Use the Median
Another way to discover the “center” the a dataset is come use the median, i m sorry is found by arranging all of the individual worths in a dataset from smallest to largest and also finding the center value.
Because the the means it is calculated, the typical is less affected by outliers and it walk a far better job of recording the main location the a circulation when there room outliers present.
See more: Un Cuarto De Balboa Coin 2008 Worth, Panama 1/4 Balboa Km 138 Prices & Values
For example, consider the adhering to chart that mirrors the square footage of dwellings in a specific neighborhood:
The average is heavily influenced by a pair extremely large houses, if the median is not. Thus, the median does a much better job of catching the “typical” square footage of a home in this ar compared come the mean.
Measures of central Tendency – Mean, Median, and also ModeDixon’s Q Test for Detecting OutliersOutlier Calculator