Interquartile range
In descriptive statistics, the interquartile range is a measure of statistical dispersion, which is the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the data set is divided into quartiles, or four rank-ordered even parts via linear interpolation. These quartiles are denoted by Q1, Q2, and Q3. The lower quartile corresponds with the 25th percentile and the upper quartile corresponds with the 75th percentile, so IQR = Q3 − Q1.
The IQR is an example of a trimmed estimator, defined as the 25% trimmed range, which enhances the accuracy of dataset statistics by dropping lower-contribution, outlying points. It is also used as a robust measure of scale. It can be clearly visualized by the box on a box plot.
Use
Unlike total range, the interquartile range has a breakdown point of 25% and is thus often preferred to the total range. The IQR is used to build box plots, simple graphical representations of a probability distribution.
The IQR is used in businesses as a marker for their income rates.
For a symmetric distribution, half the IQR equals the median absolute deviation.
The median is the corresponding measure of central tendency.
The IQR can be used to identify outliers. The IQR also may indicate the skewness of the dataset.
The quartile deviation or semi-interquartile range is defined as half the IQR.
Algorithm
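The IQR of a set of values is calculated as the difference between the upper and lower quartiles, Q3 and Q1. Each quartile is a median calculated as follows. Given an even 2n or odd 2n+1 number of values:
- The first quartile Q1 is the median of the n smallest values.
- The third quartile Q3 is the median of the n largest values.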
The second quartile Q2 is the same as the ordinary median.
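The median-of-halves rule can be sketched in a few lines of Python. This is a minimal illustration, assuming Q1 is the median of the n smallest values and Q3 the median of the n largest (for 2n+1 values the middle value belongs to neither half); the function names `quartiles` and `iqr` are hypothetical, and other quartile conventions give slightly different results.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    xs = sorted(values)
    n = len(xs) // 2              # half size for both 2n and 2n+1 values
    if n == 0:
        raise ValueError("need at least two values")
    q1 = median(xs[:n])           # median of the n smallest values
    q2 = median(xs)               # ordinary median
    q3 = median(xs[-n:])          # median of the n largest values
    return q1, q2, q3

def iqr(values):
    """Interquartile range Q3 - Q1 under the same rule."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # (2.5, 4.5, 6.5)
print(iqr([1, 2, 3, 4, 5, 6, 7, 8]))         # 4.0
```

For an odd count such as `[1, 2, 3, 4, 5, 6, 7, 8, 9]`, the middle value 5 is the median and is excluded from both halves.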
Examples
Data set in a table
The following table has 13 rows and follows the rules for an odd number of entries.
| i | x | Median | Quartile |
| 1 | 7 | Q2=87 | Q1=31 |
| 2 | 7 | Q2=87 | Q1=31 |
| 3 | 31 | Q2=87 | Q1=31 |
| 4 | 31 | Q2=87 | Q1=31 |
| 5 | 47 | Q2=87 | Q1=31 |
| 6 | 75 | Q2=87 | Q1=31 |
| 7 | 87 | Q2=87 | |
| 8 | 115 | Q2=87 | Q3=119 |
| 9 | 116 | Q2=87 | Q3=119 |
| 10 | 119 | Q2=87 | Q3=119 |
| 11 | 119 | Q2=87 | Q3=119 |
| 12 | 155 | Q2=87 | Q3=119 |
| 13 | 177 | Q2=87 | Q3=119 |
For the data in this table the interquartile range is IQR = Q3 − Q1 = 119 - 31 = 88.
Data set in a plain-text box plot
                             +−−−−−+−+
               * |−−−−−−−−−−−|     | |−−−−−−−−−−−|
                             +−−−−−+−+
 +−−−+−−−+−−−+−−−+−−−+−−−+−−−+−−−+−−−+−−−+−−−+−−−+   Number line
 0   1   2   3   4   5   6   7   8   9   10  11  12
For the data set in this box plot:
- Lower quartile Q1 = 7
- Median Q2 = 8.5
- Upper quartile Q3 = 9
- Interquartile range, IQR = Q3 - Q1 = 2
- Lower 1.5*IQR whisker = Q1 - 1.5 * IQR = 7 - 3 = 4.
- Upper 1.5*IQR whisker = Q3 + 1.5 * IQR = 9 + 3 = 12.
- Pattern of the latter two bullet points: if there is no data point exactly at a computed whisker limit, the whisker ends at the nearest data point slightly "inland" of that limit (the lowest point above 4, or the highest point below 12).
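The whisker arithmetic above can be sketched as follows. The helper names (`fences`, `whisker_ends`, `outliers`) and the sample values are hypothetical, chosen only so that Q1 = 7 and Q3 = 9 match the box plot; the quartiles are passed in rather than recomputed, since the result depends on the quartile convention used.

```python
def fences(q1, q3, k=1.5):
    """Return the lower and upper k*IQR limits."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def whisker_ends(data, q1, q3, k=1.5):
    """Whiskers stop at the most extreme data points inside the limits.

    Raises ValueError if no data point lies inside the limits.
    """
    lo, hi = fences(q1, q3, k)
    inside = [x for x in data if lo <= x <= hi]
    return min(inside), max(inside)

def outliers(data, q1, q3, k=1.5):
    """Points strictly outside the limits are flagged as outliers."""
    lo, hi = fences(q1, q3, k)
    return [x for x in data if x < lo or x > hi]

# Hypothetical sample; Q1 = 7 and Q3 = 9 are taken from the box plot above.
sample = [1, 4.5, 7, 8, 8.5, 9, 9, 11.5]
print(fences(7, 9))                 # (4.0, 12.0)
print(whisker_ends(sample, 7, 9))   # (4.5, 11.5)
print(outliers(sample, 7, 9))       # [1]
```

In this sample no point sits exactly at 4 or 12, so the whiskers end at 4.5 and 11.5, the nearest points inside the limits.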
Distributions
The interquartile range of a continuous distribution can be calculated by integrating the probability density function. The lower quartile, Q1, is a number such that the integral of the PDF from −∞ to Q1 equals 0.25, while the upper quartile, Q3, is a number such that the integral from −∞ to Q3 equals 0.75. In terms of the CDF, the quartiles can be defined as follows:
$$Q_1 = \text{CDF}^{-1}(0.25), \qquad Q_3 = \text{CDF}^{-1}(0.75),$$
where $\text{CDF}^{-1}$ is the quantile function.
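As an illustration, the IQR of a continuous distribution can be obtained either directly from its quantile function or by numerically inverting its CDF. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `iqr_from_quantile` and `quantile_by_bisection`, the search bracket, and the parameters (mean 10, standard deviation 2) are assumptions made for the example.

```python
from statistics import NormalDist

def iqr_from_quantile(inv_cdf):
    """IQR = CDF^-1(0.75) - CDF^-1(0.25), given a quantile function."""
    return inv_cdf(0.75) - inv_cdf(0.25)

def quantile_by_bisection(cdf, p, lo, hi, tol=1e-10):
    """Find x in [lo, hi] with cdf(x) close to p.

    Assumes cdf is continuous and nondecreasing, with cdf(lo) <= p <= cdf(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

sigma = 2.0
dist = NormalDist(mu=10.0, sigma=sigma)

exact = iqr_from_quantile(dist.inv_cdf)
q1 = quantile_by_bisection(dist.cdf, 0.25, -100.0, 100.0)
q3 = quantile_by_bisection(dist.cdf, 0.75, -100.0, 100.0)

print(exact / sigma)   # roughly 1.349
print(q3 - q1)         # agrees with `exact` up to the bisection tolerance
```

Bisection is used here only because it needs nothing beyond the CDF itself; when a closed-form quantile function is available, it is the simpler route.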
The interquartile range and median of some common distributions are shown below.
| Distribution | Median | IQR |
| Normal | μ | 2 Φ−1(0.75) σ ≈ 1.349σ |
| Laplace | μ | 2b ln(2) ≈ 1.386b |
| Cauchy | μ | 2γ |
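The IQR entries in the table follow from setting the CDF equal to 0.75 and using symmetry about the median. The short derivation below assumes the standard parameterizations: location μ in all three cases, with scale σ for the normal, b for the Laplace, and γ for the Cauchy distribution; F denotes the CDF and Φ the standard normal CDF.

```latex
\begin{aligned}
\text{Normal:}\quad
  & Q_3 = \mu + \sigma\,\Phi^{-1}(0.75), \quad
    Q_1 = \mu - \sigma\,\Phi^{-1}(0.75)
  \;\Rightarrow\; \mathrm{IQR} = 2\,\Phi^{-1}(0.75)\,\sigma \approx 1.349\,\sigma \\[4pt]
\text{Laplace:}\quad
  & F(x) = 1 - \tfrac{1}{2}\, e^{-(x-\mu)/b} \ \ (x \ge \mu), \quad
    F(Q_3) = \tfrac{3}{4}
  \;\Rightarrow\; Q_3 = \mu + b \ln 2 \\
  & Q_1 = \mu - b \ln 2
  \;\Rightarrow\; \mathrm{IQR} = 2b \ln 2 \approx 1.386\,b \\[4pt]
\text{Cauchy:}\quad
  & F(x) = \tfrac{1}{2} + \tfrac{1}{\pi} \arctan\!\left(\frac{x-\mu}{\gamma}\right), \quad
    F(Q_3) = \tfrac{3}{4}
  \;\Rightarrow\; \arctan\!\left(\frac{Q_3-\mu}{\gamma}\right) = \frac{\pi}{4} \\
  & Q_3 = \mu + \gamma, \quad Q_1 = \mu - \gamma
  \;\Rightarrow\; \mathrm{IQR} = 2\gamma
\end{aligned}
```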
Interquartile range test for normality of distribution
The IQR, mean, and standard deviation of a population P can be used in a simple test of whether or not P is normally distributed, or Gaussian. If P is normally distributed, then the standard score of the first quartile, z1, is −0.67, and the standard score of the third quartile, z3, is +0.67. Given mean = μ and standard deviation = σ for P, if P is normally distributed, the first quartile is
$$Q_1 = \mu + \sigma z_1 = \mu - 0.67\,\sigma$$
and the third quartile is
$$Q_3 = \mu + \sigma z_3 = \mu + 0.67\,\sigma.$$
If the actual values of the first or third quartiles differ substantially from the calculated values, P is not normally distributed. However, a normal distribution can be trivially perturbed so that its Q1 and Q3 standard scores remain at −0.67 and +0.67 while it is no longer normally distributed. A better test of normality, such as a Q–Q plot, would be indicated here.
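The comparison described in this section can be sketched with the Python standard library. The function names `expected_quartiles` and `quartile_gaps` are hypothetical, the observed quartiles are supplied by the caller (so any quartile convention can be used), and what counts as a "substantial" difference is left to the reader; this is a rough screening aid, not a formal test.

```python
from statistics import NormalDist, mean, pstdev

# Standard score of the third quartile of a normal distribution,
# about 0.6745 (the 0.67 quoted in the text).
Z = NormalDist().inv_cdf(0.75)

def expected_quartiles(mu, sigma):
    """Quartiles a normal population with this mean and sd would have."""
    return mu - Z * sigma, mu + Z * sigma

def quartile_gaps(data, q1, q3):
    """Observed minus expected quartiles, in units of the standard deviation."""
    mu, sigma = mean(data), pstdev(data)   # population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero")
    e1, e3 = expected_quartiles(mu, sigma)
    return (q1 - e1) / sigma, (q3 - e3) / sigma

# Example with the 13 values and quartiles from the table earlier in the article.
data = [7, 7, 31, 31, 47, 75, 87, 115, 116, 119, 119, 155, 177]
gap1, gap3 = quartile_gaps(data, q1=31, q3=119)
print(gap1, gap3)   # gaps far from 0 suggest the data are not normal
```

Gaps near zero are consistent with normality but do not establish it, which is the caveat raised above.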