When comparing distributions, one must compare their shapes, centers, spreads, and unusual features.
When comparing aspects of distributions, one should use comparative language such as “larger,” “smaller,” “more than,” and “less than.”
Parallel boxplots allow us to easily compare multiple data sets, but they don’t show gaps.
This lesson builds on students’ earlier high school statistics work analyzing distributions by shape, center, and spread. In the task, students receive data on home runs per season for Major League Baseball’s top home run hitters. They graph the data as a boxplot (standard or modified; the two look the same here because the data set contains no outliers) and then assemble their boxplots side by side. After comparing the distributions, students construct an argument for the player they believe to be the greatest home run hitter of all time.
Vocabulary: univariate, quantitative, median, interquartile range, quartile 1, quartile 3, approximately symmetric, skewed
- 10 common boxplot axes (1 per group). On large sheets of paper or butcher paper, draw or print the same axis on each sheet (we recommend tick marks at 0, 20, 40, 60, and 80). The axes must line up with each other exactly so the boxplots can be assembled side by side.
- Data sets for MLB’s home run hitters