Five-number summary
The five-number summary is a set of descriptive statistics that provides information about a dataset. It consists of the five most important sample percentiles:
- the sample minimum
- the lower quartile or first quartile
- the median
- the upper quartile or third quartile
- the sample maximum
In order for these statistics to exist, the observations must be from a univariate variable that can be measured on an ordinal, interval or ratio scale.
Use and representation
The five-number summary provides a concise summary of the distribution of the observations. Reporting five numbers avoids the need to decide on the most appropriate summary statistic. The five-number summary gives information about the location (from the median), the spread (from the quartiles) and the range (from the sample minimum and maximum) of the observations. Since it reports order statistics, the five-number summary is appropriate for ordinal measurements, as well as interval and ratio measurements. It is possible to quickly compare several sets of observations by comparing their five-number summaries, which can be represented graphically using a boxplot.
In addition to the points themselves, many L-estimators can be computed from the five-number summary, including interquartile range, midhinge, range, mid-range, and trimean.
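As a minimal sketch of how these L-estimators follow from the five numbers, the Python function below uses the usual textbook definitions. The function name l_estimators and the input values are illustrative only and do not come from any particular library or dataset.

```python
def l_estimators(minimum, q1, median, q3, maximum):
    """Derive common L-estimators from a five-number summary.

    Hypothetical helper for illustration; the formulas are the
    standard definitions of each statistic.
    """
    return {
        "range": maximum - minimum,
        "mid_range": (minimum + maximum) / 2,
        "interquartile_range": q3 - q1,
        "midhinge": (q1 + q3) / 2,
        # The trimean weights the median twice as heavily as each quartile.
        "trimean": (q1 + 2 * median + q3) / 4,
    }


# Made-up five-number summary: min=1, Q1=2, median=4, Q3=6, max=9.
stats = l_estimators(1, 2, 4, 6, 9)
for name, value in stats.items():
    print(name, value)
```

Every quantity is a fixed linear combination of order statistics, which is what makes it an L-estimator.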
The five-number summary is sometimes represented as in the following table:
Example
This example calculates the five-number summary for the following set of observations: 0, 0, 1, 2, 63, 61, 27, 13. These are the numbers of known moons of each planet in the Solar System.
It helps to put the observations in ascending order: 0, 0, 1, 2, 13, 27, 61, 63. There are eight observations, so the median is the mean of the two middle numbers, (2 + 13)/2 = 7.5. Splitting the observations either side of the median gives two groups of four observations. The median of the first group is the lower or first quartile, and is equal to (0 + 1)/2 = 0.5. The median of the second group is the upper or third quartile, and is equal to (27 + 61)/2 = 44.
The smallest and largest observations are 0 and 63.
So the five-number summary would be 0, 0.5, 7.5, 44, 63.
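The hand calculation above can be sketched in Python using only the standard library. The function name five_number_summary is hypothetical. The text only covers an even number of observations; for an odd number, this sketch assumes the median is left out of both halves, which is one of several conventions in use.

```python
from statistics import median


def five_number_summary(observations):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumption: with an odd number of observations the middle value is
    excluded from both halves. Other conventions include it in both.
    """
    data = sorted(observations)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


moons = [0, 0, 1, 2, 63, 61, 27, 13]
print(five_number_summary(moons))  # (0, 0.5, 7.5, 44.0, 63)
```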
Example in R
It is possible to calculate the five-number summary in the R programming language using the fivenum function. The summary function, when applied to a vector, displays the five-number summary together with the mean. The fivenum function uses a different method to calculate percentiles than the summary function.
Example in Python
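This Python example uses the percentile function from the numerical library NumPy and works in Python 2 and 3. The midpoint rule is requested so that the quartiles agree with the worked example above; the default linear interpolation gives different quartiles for these data.
import numpy as np
def fivenum(data):
    """Five-number summary."""
    # NumPy releases older than 1.22 spell this keyword interpolation="midpoint".
    return np.percentile(data, [0, 25, 50, 75, 100], method="midpoint")
>>> moons = [0, 0, 1, 2, 63, 61, 27, 13]
>>> print(fivenum(moons))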
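The choice of quartile convention matters for small samples. As a rough illustration, assuming Python 3.8 or later, the standard-library function statistics.quantiles offers two interpolation methods, and neither reproduces the quartiles 0.5 and 44 obtained by taking medians of the two halves of the moon data.

```python
from statistics import quantiles

moons = [0, 0, 1, 2, 63, 61, 27, 13]

# "inclusive" treats the data as containing the population extremes;
# it corresponds to plain linear interpolation between order statistics.
inclusive = quantiles(moons, n=4, method="inclusive")

# "exclusive" (the default) assumes more extreme values may exist
# outside the sample.
exclusive = quantiles(moons, n=4, method="exclusive")

print(inclusive)  # [0.75, 7.5, 35.5]
print(exclusive)  # [0.25, 7.5, 52.5]
```

All three conventions agree on the median, 7.5, but differ on the first and third quartiles, so a reported five-number summary should state which method was used.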
Example in SAS
You can use PROC UNIVARIATE in SAS to get the five-number summary:
data fivenum;
input x @@;
datalines;
1 2 3 4 20 202 392 4 38 20
;
run;
ods select Quantiles;
proc univariate data = fivenum;
output out = fivenums min = min Q1 = Q1 Q2 = median Q3 = Q3 max = max;
run;
proc print data = fivenums;
run;
Example in Stata
input byte y
0
0
1
2
63
61
27
13
end
list
tabstat y, statistics(min q max)