Statistics
We collect, organise and summarise univariate data using measures of central tendency (mean, median, mode) and measures of spread (range, quartiles, IQR). We draw and interpret box-and-whisker plots and histograms, and work with both ungrouped and grouped data.
8.1 Measures of Central Tendency & Spread
- Calculate mean, median, mode for ungrouped data; estimate mean for grouped data
- Calculate range, quartiles (Q1, Q2, Q3), IQR and semi-IQR
- Construct a five-number summary and box-and-whisker plot
- Identify and interpret percentiles
Real-World Connection
When newspapers report the 'average salary', they usually quote the mean, which is pulled upward by a few very high earners. The median is usually more representative of what most workers earn. Knowing the difference between these two measures helps you read any statistical report critically.
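The salary effect described above can be seen in a minimal sketch. The salary figures are hypothetical, chosen only so that one value is far larger than the rest.

```python
from statistics import mean, median

# Hypothetical monthly salaries; the last value is one very high earner.
salaries = [8000, 9000, 9500, 10000, 11000, 12000, 150000]

salary_mean = mean(salaries)      # pulled upward by the single large value
salary_median = median(salaries)  # stays at the middle salary

print(round(salary_mean, 2), salary_median)
```

In this made-up data set, six of the seven salaries lie below the mean, while the median sits in the middle of the typical salaries.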
Mean (ungrouped)
$\bar{x} = \dfrac{\sum x_i}{n}$, where $\sum x_i$ = sum of all values; $n$ = number of data values
Definition
Mode
The value that appears most frequently in a data set. A data set may be unimodal (one mode), bimodal (two modes), or have no mode if all values appear equally often.
Definition
Median
The middle value when data is arranged in ascending order. With $n$ odd: median is the $\frac{n+1}{2}$th value. With $n$ even: median is the average of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th values.
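As a rough illustration of the three measures of central tendency, the sketch below uses Python's standard `statistics` module on a hypothetical list of marks (the data is invented for this example).

```python
from statistics import median, multimode

# Hypothetical test marks, for illustration only.
marks = [4, 7, 7, 8, 9, 10, 11]

n = len(marks)
data_mean = sum(marks) / n      # sum of all values divided by n
data_median = median(marks)     # n = 7 is odd, so this is the 4th sorted value
data_modes = multimode(marks)   # list of the most frequent value(s)

print(data_mean, data_median, data_modes)
```

Note that `multimode` returns every value when all values appear equally often, so a result as long as the data set corresponds to "no mode" in the sense used here.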
Interquartile Range
$\text{IQR} = Q_3 - Q_1$, where $Q_1$ = lower quartile (median of lower half); $Q_3$ = upper quartile (median of upper half)
Definition
Five-Number Summary
A data summary using five values: Minimum, Q1, Median (Q2), Q3, Maximum. These five values define the box-and-whisker plot.
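The five-number summary can be sketched in code as follows. This is one possible implementation, assuming the median-of-halves convention (the median itself is left out of both halves when $n$ is odd); the function name `five_number_summary` and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                 # lower half of the data
    upper = data[half + 1:] if n % 2 else data[half:]   # upper half of the data
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical data set, for illustration only.
scores = [3, 5, 7, 8, 12, 13, 14, 18, 21]
minimum, q1, q2, q3, maximum = five_number_summary(scores)
iqr = q3 - q1          # interquartile range
semi_iqr = iqr / 2     # semi-interquartile range

print(minimum, q1, q2, q3, maximum, iqr, semi_iqr)
```

Other quartile conventions exist (statistical software often interpolates), so results from a calculator or library may differ slightly from this method.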
💡 Tip
Semi-IQR = IQR/2. The p-th percentile is the value below which p% of the data falls. Q1 = 25th percentile, Q2 = 50th percentile (median), Q3 = 75th percentile.
Worked Example
Find mode and compare with mean and median
Problem
Worked Example
Five-number summary and box plot
Problem
Worked Example
Effect of an outlier on mean vs median
Problem
CAPS Cognitive Level Distribution
8.2 Grouped Data & Histograms
- Estimate the mean of grouped data using midpoints of class intervals
- Identify the modal class interval and the class interval containing the median
- Draw and interpret histograms (no gaps between bars; frequency on y-axis)
- Use statistical summaries to analyse data and make comments in context
Real-World Connection
National exam results for millions of students are usually reported as grouped data, with marks in bands such as 0–29%, 30–39%, and so on. A histogram reveals at a glance whether most students passed or failed, and where performance clusters. Reading histograms like these helps to inform education policy decisions.
Estimated mean of grouped data
$\bar{x} \approx \dfrac{\sum f_i m_i}{\sum f_i}$, where $f_i$ = frequency of class $i$; $m_i$ = midpoint of class interval $i$
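A minimal sketch of the grouped-mean estimate, assuming each class is stored as a hypothetical `(lower bound, upper bound, frequency)` tuple. The frequency table is invented for illustration.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency).
classes = [(0, 10, 3), (10, 20, 7), (20, 30, 12), (30, 40, 6), (40, 50, 2)]

total_f = sum(f for _, _, f in classes)                     # sum of f_i
total_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # sum of f_i * m_i
estimated_mean = total_fm / total_f

print(total_f, total_fm, estimated_mean)
```

The result is only an estimate because every value in a class is treated as if it were equal to the class midpoint.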
Definition
Modal class
The class interval with the highest frequency. We cannot identify the exact mode in grouped data.
Definition
Median class
The class interval that contains the $\frac{n}{2}$th value (the class in which the cumulative frequency crosses 50%). We cannot find the exact median from grouped data, only the interval that contains it.
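The modal class and median class can be located with a short loop over a frequency table. The sketch below uses a hypothetical table of `(lower bound, upper bound, frequency)` tuples and assumes the median class is the first class whose cumulative frequency reaches half of the total frequency.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency).
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 9), (30, 40, 10), (40, 50, 4)]

# Modal class: the interval with the highest frequency
# (max returns the first such class if there is a tie).
modal_class = max(classes, key=lambda c: c[2])[:2]

# Median class: first interval where the running total reaches half of n.
n = sum(f for _, _, f in classes)
cumulative = 0
median_class = None
for lo, hi, f in classes:
    cumulative += f
    if cumulative >= n / 2:
        median_class = (lo, hi)
        break

print(modal_class, median_class)
```

In this made-up table the two classes differ: the modal class is 30 to 40, while the median class is 20 to 30.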
ℹ️ Note
In a histogram, each bar represents a class interval. The bars touch (no gaps). The area of each bar is proportional to the frequency. When class widths are equal, frequency is proportional to bar height.
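To illustrate the link between area and frequency, the sketch below computes bar heights as frequency divided by class width (often called frequency density) for a hypothetical table with unequal class widths. The variable names and data are assumptions made for this example.

```python
# Hypothetical classes with unequal widths: (lower bound, upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 40, 8)]

# Height chosen so that height * width (the bar's area) equals the frequency.
heights = [f / (hi - lo) for lo, hi, f in classes]
areas = [h * (hi - lo) for h, (lo, hi, _) in zip(heights, classes)]

print(heights, areas)
```

The last two classes have the same frequency, but the wider class gets a bar half as tall, so both bars cover the same area.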
Worked Example
Estimate mean from a frequency table
Problem
Worked Example
Find modal class and median class
Problem
CAPS Cognitive Level Distribution