Grade 9 Mathematics
Grade 9 · Term 3 · Mathematics

Data Handling

We collect, organise, and represent data using various graphs. We calculate and interpret measures of central tendency (mean, median, mode) and spread (range, quartiles), and draw scatter plots to investigate relationships between two variables.

Week 5

7.1 Collecting and Representing Data

  • Identify data sources and distinguish between populations and samples
  • Represent data in frequency tables, histograms, bar charts, pie charts, and line graphs
  • Identify misuse of statistics in media
🌍

Real-World Connection

Data is everywhere — sports statistics, election polls, weather forecasts, and medical research. The way data is represented dramatically affects how it's perceived. A misleading axis scale on a bar chart can make a 2% increase look like a 200% jump. Being able to critically read graphs is one of the most important life skills in the modern information age.

Definition

Population vs Sample

A population is the entire group being studied. A sample is a subset of the population, chosen so that it represents the whole group. We use samples when studying the whole population is impractical.

e.g. Population: all Grade 9 learners in SA. Sample: 200 Grade 9 learners from 5 schools.

Types of Graphs

Graph Type | Best Used For | Strength
Bar chart | Comparing discrete categories | Clear visual comparison
Histogram | Continuous data in groups (class intervals) | Shows distribution shape
Pie chart | Showing parts of a whole (percentages) | Easy proportion reading
Line graph | Trends over time | Shows change and direction

Worked Examples

Worked Example

Building a frequency table and histogram

Problem

20 learners' test scores: 45, 52, 67, 73, 81, 58, 62, 70, 55, 48, 76, 83, 91, 64, 57, 69, 74, 88, 42, 79. Group in intervals 40–49, 50–59, 60–69, 70–79, 80–89, 90–99.
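
The tallying for this problem can be sketched in Python. This is an illustrative sketch rather than the notes' own solution; the helper name `class_interval` is hypothetical, and it assumes every interval is 10 marks wide, as in the problem.

```python
from collections import Counter

scores = [45, 52, 67, 73, 81, 58, 62, 70, 55, 48,
          76, 83, 91, 64, 57, 69, 74, 88, 42, 79]

def class_interval(score, width=10):
    """Return the label of the interval containing score, e.g. 45 -> '40-49'."""
    lower = (score // width) * width
    return f"{lower}-{lower + width - 1}"

# Count how many scores fall into each class interval.
frequency = Counter(class_interval(s) for s in scores)

# Print the frequency table in interval order.
for label in sorted(frequency):
    print(f"{label}: {frequency[label]}")
```

The frequencies are the bar heights of the histogram. Because the data is grouped into continuous intervals, the bars are drawn touching each other.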

Worked Example

Identifying misleading statistics

Problem

A newspaper claims: 'Our product works for 80% of users!' The fine print shows the survey had 10 people and only 8 responded. Identify the issue.

Worked Example

Reading and interpreting a pie chart

Problem

A school survey shows 180 learners' favourite sports: Football 30%, Basketball 25%, Swimming 20%, Athletics 15%, Other 10%. How many learners chose Football and Swimming combined?
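
One way to set out the calculation, given as a sketch rather than the notes' own solution, is to add the two percentages first and then take that share of the 180 learners:

```latex
\[
30\% + 20\% = 50\%, \qquad 50\% \times 180 = 0.5 \times 180 = 90 \text{ learners}
\]
```
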
Activity — 8 Questions

CAPS Cognitive Level Distribution

L1 · Knowledge: 2 Q
L2 · Routine Procedures: 2 Q
L3 · Complex Procedures: 2 Q
L4 · Problem Solving: 2 Q

1. L1 · Knowledge (1 mark)
Name the graph best suited to show how a school's budget is divided among subjects.

2. L1 · Knowledge (1 mark)
A tally chart shows: Soccer: ||||, Cricket: |||, Swimming: ||. How many learners were surveyed in total?

3. L2 · Routine Procedures (4 marks)
Data: 8, 12, 15, 9, 11, 14, 8, 17. Create a frequency table and find the range.

4. L2 · Routine Procedures (3 marks)
A pie chart shows that 72° represents 'Mathematics' in a school's timetable. What percentage of time is given to Mathematics?

5. L3 · Complex Procedures (4 marks)
A journalist shows a bar chart with y-axis starting at 95 instead of 0, making a 2% increase look enormous. Explain the misleading technique and how to correct it.

6. L3 · Complex Procedures (4 marks)
Describe how you would collect a representative sample of Grade 9 learners' study hours from your school. What sampling method would you use?

7. L4 · Problem Solving (5 marks)
30 learners wrote a test. 10 scored in the 70s, 8 in the 60s, 6 in the 80s, 4 in the 50s, 2 in the 90s. Draw a frequency table and calculate the estimated mean using midpoints.

8. L4 · Problem Solving (5 marks)
Explain why the mean can be misleading as a measure of central tendency for data such as: 5, 6, 7, 8, 50. Calculate both mean and median and state which is more representative.

Weeks 6–7

7.2 Measures of Central Tendency and Spread

  • Calculate and interpret the mean, median, and mode
  • Calculate the range, inter-quartile range, and identify outliers
  • Draw and interpret box-and-whisker plots
🌍

Real-World Connection

Quartiles and box plots are used in medicine, business, and sport. A doctor checking whether a child's growth is normal compares the child to a growth curve built from quartiles (25th, 50th, 75th percentile). The IQR tells you where the 'middle 50%' of data lies — a tight IQR means consistent results, a wide IQR means high variability.

Definition

Measures of Central Tendency

Values that represent the 'centre' of a data set.

$$\text{Mean: } \bar{x} = \frac{\sum x}{n} \qquad \text{Median: middle value} \qquad \text{Mode: most frequent}$$

Definition

Quartiles

Quartiles divide ordered data into four equal groups. Q1 = lower quartile (25th percentile), Q2 = median (50th), Q3 = upper quartile (75th).

$$Q_2 = \text{Median}, \quad Q_1 = \text{median of lower half}, \quad Q_3 = \text{median of upper half}$$

Inter-Quartile Range (IQR)

$$\text{IQR} = Q_3 - Q_1$$

Measures the spread of the middle 50% of data. Less affected by outliers than the range.
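
The notes do not state a rule for identifying outliers. One widely used convention, assumed here rather than taken from the source, flags any value that lies outside the following fences:

```latex
\[
\text{Lower fence} = Q_1 - 1.5 \times \text{IQR}, \qquad
\text{Upper fence} = Q_3 + 1.5 \times \text{IQR}
\]
```

For example, with Q1 = 10 and Q3 = 20 the IQR is 10, so values below -5 or above 35 would be treated as outliers under this convention.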

Worked Examples

Worked Example

Finding quartiles and drawing a box plot

Problem

Data: 12, 15, 18, 22, 25, 27, 30, 33, 38. Find the five-number summary and describe the spread.
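
A minimal Python sketch of the five-number summary for this data set follows. It uses the median-of-halves method for quartiles described in the definition above; the function name `five_number_summary` is hypothetical, and other quartile conventions (used by some calculators and software) can give slightly different values.

```python
from statistics import median

data = [12, 15, 18, 22, 25, 27, 30, 33, 38]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the middle position
    upper = ordered[half + n % 2:]    # values above it (skips the median when n is odd)
    return min(ordered), median(lower), median(ordered), median(upper), max(ordered)

minimum, q1, q2, q3, maximum = five_number_summary(data)
iqr = q3 - q1
data_range = maximum - minimum

print(minimum, q1, q2, q3, maximum)
print("IQR:", iqr, "Range:", data_range)
```

The five values are exactly what a box-and-whisker plot needs: the box runs from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.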

Worked Example

Mean, median, mode — which to use?

Problem

A company has 9 employees with monthly salaries (in R): 8000, 8500, 9000, 9000, 9500, 10000, 10500, 11000, 75000. Find mean, median, and mode. Which best represents a 'typical' salary?
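
The three measures for this salary list can be checked with Python's standard `statistics` module. This is a sketch for verification, not the notes' own solution; the variable names are illustrative.

```python
from statistics import mean, median, mode

salaries = [8000, 8500, 9000, 9000, 9500, 10000, 10500, 11000, 75000]

salary_mean = mean(salaries)      # sum of all salaries divided by 9
salary_median = median(salaries)  # 5th value in the ordered list
salary_mode = mode(salaries)      # most frequent salary

print(f"Mean:   R{salary_mean:.2f}")
print(f"Median: R{salary_median}")
print(f"Mode:   R{salary_mode}")
```

The single salary of R75 000 pulls the mean well above what eight of the nine employees earn, while the median is unaffected by how large that one value is. This suggests the median is the better description of a 'typical' salary here.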

Worked Example

Effect of changing data on measures

Problem

A data set has mean 15 and 6 values. A 7th value of 29 is added. Find the new mean.
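
One way to set out the working, given as a sketch: recover the original total from the mean, add the new value, and divide by the new number of values.

```latex
\[
\text{Original total} = 15 \times 6 = 90, \qquad
\text{New mean} = \frac{90 + 29}{7} = \frac{119}{7} = 17
\]
```
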
Activity — 8 Questions

CAPS Cognitive Level Distribution

L1 · Knowledge: 2 Q
L2 · Routine Procedures: 2 Q
L3 · Complex Procedures: 2 Q
L4 · Problem Solving: 2 Q

1. L1 · Knowledge (2 marks)
Find the mean of: 14, 18, 22, 10, 16.

2. L1 · Knowledge (2 marks)
Find the median of: 3, 7, 9, 1, 5.

3. L2 · Routine Procedures (4 marks)
Data: 4, 8, 11, 13, 15, 17, 20. Find Q1, Q2, Q3, and IQR.

4. L2 · Routine Procedures (3 marks)
Six test scores are 55, 60, 70, 75, 80, $x$. If the mean is 70, find $x$.

5. L3 · Complex Procedures (5 marks)
Box plot data: Min=10, Q1=15, Median=22, Q3=30, Max=45. Calculate IQR and identify any outlier if a data point of 60 is added.

6. L3 · Complex Procedures (4 marks)
Class A: mean=72, median=74. Class B: mean=72, median=65. Compare the two distributions and identify which has outliers pulling the data.

7. L4 · Problem Solving (5 marks)
The IQR of a dataset is 12 and Q3=35. The minimum is 8 and the maximum is 55. Find Q1, Q2 (if median=28), and state whether min and max are outliers.

8. L4 · Problem Solving (5 marks)
A class of 30 learners writes a test. The mean is 62 and the range is 48. The 6 lowest-scoring learners each scored 28. The teacher awards each of these 6 learners 5 bonus marks. Find (a) the new class mean and (b) the new range.

Week 8

7.3 Scatter Plots and Correlation

  • Draw scatter plots for bivariate data
  • Identify positive, negative, and no correlation from scatter plots
  • Draw a line of best fit and use it to make predictions
🌍

Real-World Connection

Scatter plots reveal relationships between two variables. Is more study time associated with better marks? A scatter plot of 'hours studied' vs 'test score' would typically show a positive correlation. Do taller people have larger shoe sizes? Again, the correlation is usually positive. Climate scientists use scatter plots to show the correlation between CO₂ levels and global temperature. Businesses and medical researchers use correlations like these as a starting point for making predictions.

Types of Correlation

Type | Description | Meaning
Strong positive | Points close to a line rising from left to right | As x increases, y increases
Weak positive | Points loosely scattered, general upward trend | Mild positive relationship
Negative | Points falling from left to right | As x increases, y decreases
No correlation | No discernible pattern | x and y are unrelated

⚠️ Warning

Correlation does NOT imply causation. Just because two variables are correlated doesn't mean one causes the other. Example: ice cream sales and drowning rates both rise in summer — but ice cream doesn't cause drowning. Both are caused by a third factor: hot weather.

Worked Examples

Worked Example

Drawing and interpreting a scatter plot

Problem

Hours studied: 1, 2, 3, 4, 5, 6. Marks (%) obtained: 45, 52, 61, 67, 75, 83. Plot and describe the relationship.
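
At this level the line of best fit is normally drawn by eye. As a cross-check, the sketch below computes a least-squares line for the same six points. This is a swapped-in technique, not the method used in these notes, and the function name `least_squares_line` is hypothetical.

```python
hours = [1, 2, 3, 4, 5, 6]
marks = [45, 52, 61, 67, 75, 83]

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares_line(hours, marks)
print(f"marks = {slope:.2f} * hours + {intercept:.2f}")
```

The slope is positive and every point lies close to the line, which matches a description of strong positive correlation: each extra hour of study goes with roughly 7 to 8 more marks in this data set.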

Worked Example

Using line of best fit to predict

Problem

A scatter plot of shoe size vs. height shows a line of best fit with equation $h = 8s + 120$ (h in cm, s = shoe size). Predict the height of a person with shoe size 9. Is this interpolation or extrapolation?
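
The prediction step can be sketched as below. The problem does not state the range of shoe sizes in the original data, so the range 4 to 11 used here is purely a hypothetical assumption; the function names are also illustrative.

```python
def predict_height(shoe_size):
    """Height in cm from the line of best fit h = 8s + 120."""
    return 8 * shoe_size + 120

def prediction_type(shoe_size, smallest_observed, largest_observed):
    """Interpolation if the value lies inside the observed data range."""
    if smallest_observed <= shoe_size <= largest_observed:
        return "interpolation"
    return "extrapolation"

print(predict_height(9))
# The observed range 4 to 11 is hypothetical; the problem does not give it.
print(prediction_type(9, 4, 11))
```

A prediction counts as interpolation only if shoe size 9 falls inside the range of sizes that were actually plotted. If it falls outside that range, the prediction is extrapolation and is less reliable.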

Worked Example

Identifying correlation type

Problem

Describe the correlation for: (a) Age of car vs. resale value. (b) Number of hours of sunshine vs. sales of ice cream. (c) Shoe size vs. mathematics mark.
Activity — 8 Questions

CAPS Cognitive Level Distribution

L1 · Knowledge: 2 Q
L2 · Routine Procedures: 2 Q
L3 · Complex Procedures: 2 Q
L4 · Problem Solving: 2 Q

1. L1 · Knowledge (1 mark)
Describe the correlation: as hours of TV per day increase, mathematics scores decrease.

2. L1 · Knowledge (1 mark)
On a scatter plot, points cluster tightly around a line going up from left to right. What type of correlation is this?

3. L2 · Routine Procedures (4 marks)
Data: x: 2, 4, 6, 8; y: 18, 14, 10, 6. Describe the correlation and estimate the equation of the line of best fit.

4. L2 · Routine Procedures (3 marks)
A line of best fit for a study-time vs test-score scatter plot is $y = 5x + 40$ (where $x$ = hours studied and $y$ = test score in %). Predict the score for 7 hours of study. Is this prediction reliable?

5. L3 · Complex Procedures (4 marks)
A study shows a strong positive correlation between shoe size and reading ability in primary school children. Does this mean large feet cause better reading? Explain.

6. L3 · Complex Procedures (4 marks)
Seven students' marks in Maths (x) and Science (y): (45,50), (60,55), (55,52), (70,68), (75,72), (80,78), (65,60). Estimate the correlation coefficient as strong/moderate/weak positive/negative/none.

7. L4 · Problem Solving (4 marks)
The line of best fit through data is $y = 2x + 10$. One data point is $(8, 30)$. Calculate the residual (difference between actual and predicted value) and explain what it means.

8. L4 · Problem Solving (4 marks)
Why is it dangerous to use a line of best fit to extrapolate far beyond the data range? Give a mathematical and a real-world example.

Data Handling Grade 9 Maths CAPS Notes & Examples | MathSciBuddy