Comparing Results Generated by Quartile Calculation Algorithms

The Experiment

The Java applet on this page illustrates how the results of the quartile calculation algorithms differ. It runs an experiment that draws 50,000 simple random samples of a specified size. For each sample, the first and third quartiles are calculated with each of the three quartile calculation algorithms, resulting in six sets of sample quartiles: a set of first quartiles and a set of third quartiles for each algorithm.

A five-number summary is generated for each of these sets of sample quartiles and displayed as a boxplot.
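The experiment described above can be sketched in a few lines of Python. This is a rough illustration, not the applet's code. This section does not define the three algorithms, so the position formulas in POSITION_RULES are an assumption about what "length = n-1", "length = n" and "length = n+1" mean; the function names, the seed, and the quartile method used inside the five-number summary are likewise hypothetical choices made for the sketch.

```python
import random
import statistics

# ASSUMPTION: 1-based fractional position of the p-th quantile among n sorted
# values under each rule. This section does not define the algorithms, so
# these formulas are an illustrative guess, not the applet's definitions.
POSITION_RULES = {
    "length = n-1": lambda n, p: 1 + (n - 1) * p,
    "length = n": lambda n, p: 0.5 + n * p,
    "length = n+1": lambda n, p: (n + 1) * p,
}


def quantile(sorted_values, p, rule):
    """Linearly interpolate the p-th quantile at the position given by `rule`."""
    n = len(sorted_values)
    pos = min(max(rule(n, p), 1.0), float(n))  # clamp to the valid range 1..n
    lower = int(pos)                           # rank at or just below the position
    if lower >= n:
        return sorted_values[-1]
    frac = pos - lower
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + frac * (above - below)


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); the quartile method here is arbitrary."""
    ordered = sorted(values)
    q1, med, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, med, q3, ordered[-1]


def run_experiment(draw, sample_size, num_samples=50_000, seed=0):
    """Summarize the six sets of sample quartiles for one population."""
    rng = random.Random(seed)
    results = {(name, q): [] for name in POSITION_RULES for q in ("Q1", "Q3")}
    for _ in range(num_samples):
        sample = sorted(draw(rng) for _ in range(sample_size))
        for name, rule in POSITION_RULES.items():
            results[(name, "Q1")].append(quantile(sample, 0.25, rule))
            results[(name, "Q3")].append(quantile(sample, 0.75, rule))
    return {key: five_number_summary(vals) for key, vals in results.items()}


if __name__ == "__main__":
    # Standard normal population, samples of size 10.
    summaries = run_experiment(lambda rng: rng.gauss(0.0, 1.0), sample_size=10)
    for (name, which), summary in summaries.items():
        print(name, which, [round(x, 3) for x in summary])
```

Passing `lambda rng: rng.random()` as `draw` samples from the uniform population on 0 ≤ x < 1 instead.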

The samples are drawn from one of two populations. The values in the first population are normally distributed with μ = 0 and σ = 1 (the standard normal distribution). The first quartile for this population is -0.67449 and the third quartile is 0.67449. For a normal population, the distribution of sample quartiles (first or third) is approximately normal.
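The population quartiles quoted above can be checked with the inverse cumulative distribution function of the standard normal distribution. A minimal check using Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

# Standard normal population: mu = 0, sigma = 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# The quartiles are the points below which 25% and 75% of the population lie.
q1 = standard_normal.inv_cdf(0.25)
q3 = standard_normal.inv_cdf(0.75)

print(f"{q1:.5f} {q3:.5f}")  # -0.67449 0.67449
```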

The values in the second population are uniformly distributed within the interval zero to one (0 ≤ x < 1). The first quartile is 0.25 and the third quartile is 0.75. For a uniform population, the distribution of sample quartiles (first or third) is slightly skewed toward the median.

General Conclusions

The length = n-1 algorithm tends to yield Q1 values that are too high and Q3 values that are too low; that is, both quartiles are pulled toward the median. The length = n algorithm shows the same tendency to a lesser degree: its Q1 values are a bit too high and its Q3 values a bit too low. Nevertheless, this algorithm tends to yield the most accurate results of the three. The length = n+1 algorithm, on the other hand, tends to yield Q1 values that are too low and Q3 values that are too high, so both quartiles are pushed away from the median. As one would expect, the differences among the results diminish as the sample size increases.
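A small worked example may make the direction of these differences easier to see. The sketch below applies three position rules to the values 1 through 8. The rules are an assumption about what the three algorithm names mean, since this section does not define them, and the helper names are hypothetical.

```python
def quartile_positions(n):
    """ASSUMED 1-based fractional positions of (Q1, Q3) under each rule."""
    return {
        "length = n-1": (1 + (n - 1) * 0.25, 1 + (n - 1) * 0.75),
        "length = n": (0.5 + n * 0.25, 0.5 + n * 0.75),
        "length = n+1": ((n + 1) * 0.25, (n + 1) * 0.75),
    }


def value_at(sorted_values, pos):
    """Linearly interpolate between the order statistics around `pos`.

    Assumes 1 <= pos <= len(sorted_values), which holds for n >= 3 here.
    """
    lower = int(pos)      # rank at or just below the position
    frac = pos - lower
    if frac == 0:
        return float(sorted_values[lower - 1])
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + frac * (above - below)


data = [1, 2, 3, 4, 5, 6, 7, 8]
results = {
    name: (value_at(data, p1), value_at(data, p3))
    for name, (p1, p3) in quartile_positions(len(data)).items()
}
for name, (q1, q3) in results.items():
    print(f"{name}: Q1 = {q1}, Q3 = {q3}")
# length = n-1: Q1 = 2.75, Q3 = 6.25
# length = n: Q1 = 2.5, Q3 = 6.5
# length = n+1: Q1 = 2.25, Q3 = 6.75
```

Under these assumed rules the quartile positions always differ by a quarter of a rank, whatever the sample size, while the gaps between adjacent sorted values shrink as n grows. That is consistent with the differences diminishing for larger samples.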