trimmed mean


Let $x_1, x_2, \ldots, x_n$ be a set of real-valued data observations. Let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ be the order statistics of the observations. For an integer $k$ with $0 \le k < n/2$, the $k$th trimmed mean $\bar{x}_k$ is defined as:

$$\bar{x}_k = \frac{x_{(k+1)} + x_{(k+2)} + \cdots + x_{(n-k)}}{n-2k} = \frac{1}{n-2k} \sum_{i=k+1}^{n-k} x_{(i)}.$$

After ordering the original observations and discarding the $k$ smallest and the $k$ largest observations, the trimmed mean takes the arithmetic average of the remaining data. The idea of a trimmed mean is to eliminate outliers, that is, extreme observations that do not seem to have any logical explanation, when calculating the overall mean of a population.
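The definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `trimmed_mean` and its argument names are hypothetical, and the guard on `k` reflects the requirement that at least one observation remains after trimming.

```python
def trimmed_mean(xs, k):
    """Return the k-th trimmed mean of xs (hypothetical helper for illustration)."""
    n = len(xs)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n/2")
    ordered = sorted(xs)        # order statistics x_(1) <= ... <= x_(n)
    kept = ordered[k:n - k]     # x_(k+1), ..., x_(n-k)
    return sum(kept) / (n - 2 * k)


# The extreme values 1 and 100 are discarded; 3, 4 and 5 are averaged.
print(trimmed_mean([5, 1, 100, 3, 4], 1))
```

With `k = 0` nothing is discarded, so the function reduces to the ordinary arithmetic mean.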

For example, suppose 10 new lightbulbs are drawn from a population of 100 to estimate the average lifetime of a typical lightbulb, measured in hours. The measurements are 802, 854, 823, 428, 815, 840, 833, 809, 843, 821. The (arithmetic) mean of the measurements is

$$\frac{802+854+823+428+815+840+833+809+843+821}{10} = 786.8,$$

with sample standard deviation = 127.1, whereas the 1st trimmed mean gives:

$$\frac{802+823+815+840+833+809+843+821}{8} = 823.25,$$

with sample standard deviation = 14.6, a much smaller value than the 127.1 obtained from the full sample.
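The figures in this example can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen here for illustration, and `statistics.stdev` is the sample standard deviation (denominator $n-1$).

```python
import statistics

lifetimes = [802, 854, 823, 428, 815, 840, 833, 809, 843, 821]

# Full sample: arithmetic mean and sample standard deviation.
full_mean = statistics.mean(lifetimes)
full_sd = statistics.stdev(lifetimes)

# 1st trimmed sample: drop the smallest (428) and the largest (854) value.
trimmed = sorted(lifetimes)[1:-1]
trimmed_mean_1 = statistics.mean(trimmed)
trimmed_sd = statistics.stdev(trimmed)

print(round(full_mean, 2), round(full_sd, 1))
print(round(trimmed_mean_1, 2), round(trimmed_sd, 1))
```

Rounded to the precision used in the text, the two printed lines should agree with the values quoted above for the full and the trimmed sample.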

The trimmed mean gives a much more robust estimation (an estimation not greatly affected by outliers) of the average than the arithmetic mean.

Another robust estimator of a mean is the winsorized mean. Like the trimmed mean, the winsorized mean eliminates the outliers at both ends of an ordered set of observations. Unlike the trimmed mean, the winsorized mean replaces the outliers with observed values, rather than discarding them. The formal definition of the $k$th winsorized mean $w_k$ is:

$$w_k = \frac{(k+1)x_{(k+1)} + x_{(k+2)} + \cdots + x_{(n-k-1)} + (k+1)x_{(n-k)}}{n} = \frac{k x_{(k+1)} + (n-2k)\bar{x}_k + k x_{(n-k)}}{n}.$$

From the definition, we see that the winsorized mean is the average of the observations where the $k$ smallest values are replaced by the $(k+1)$th smallest value, $x_{(k+1)}$, and the $k$ largest values are replaced by the $(k+1)$th largest value, $x_{(n-k)}$.
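The replacement rule just described can be sketched in Python as follows. The function name `winsorized_mean` is hypothetical, and clamping every value into the interval between the $(k+1)$th smallest and the $(k+1)$th largest observation is one possible way to carry out the replacement, not the only one.

```python
def winsorized_mean(xs, k):
    """Return the k-th winsorized mean of xs (hypothetical helper for illustration)."""
    n = len(xs)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n/2")
    ordered = sorted(xs)
    low = ordered[k]             # x_(k+1), the (k+1)-th smallest value
    high = ordered[n - k - 1]    # x_(n-k), the (k+1)-th largest value
    # Values below low are raised to low; values above high are lowered to high.
    clamped = [min(max(x, low), high) for x in ordered]
    return sum(clamped) / n      # the divisor stays n: nothing is discarded


# 1 is replaced by 3 and 100 by 5, so the average is taken over 3, 3, 4, 5, 5.
print(winsorized_mean([5, 1, 100, 3, 4], 1))
```

Note that the divisor is $n$ rather than $n-2k$, which is the main computational difference from the trimmed mean.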

For the lightbulb example above, the 1st winsorized mean is

$$\frac{802+843+823+802+815+840+833+809+843+821}{10} = 823.1,$$

with sample standard deviation = 16.1, fairly close to the answer given by the trimmed mean.
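As a consistency check, the same value follows from the second form of the definition of $w_k$, using the 1st trimmed mean $\bar{x}_1 = 823.25$ computed earlier together with $x_{(2)} = 802$ and $x_{(9)} = 843$ from the ordered sample ($n = 10$, $k = 1$):

$$w_1 = \frac{1 \cdot x_{(2)} + (10 - 2)\,\bar{x}_1 + 1 \cdot x_{(9)}}{10} = \frac{802 + 8 \cdot 823.25 + 843}{10} = \frac{802 + 6586 + 843}{10} = \frac{8231}{10} = 823.1.$$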

Title trimmed mean
Canonical name TrimmedMean
Date of creation 2013-03-22 14:42:02
Last modified on 2013-03-22 14:42:02
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 5
Author CWoo (3771)
Entry type Definition
Classification msc 62F10
Classification msc 62F35
Defines winsorized mean
Defines outlier
Defines robust estimation