This lesson introduces the fundamental building blocks of statistics: the different types of data and the methods used to collect them. Data classification and sampling are essential for the AQA GCSE Mathematics Statistics topic and frequently appear in exam questions worth 2–4 marks.
Data can be classified in several ways. The first distinction is between qualitative and quantitative data.
| Type | Definition | Examples |
|---|---|---|
| Qualitative | Data that describes qualities or characteristics (non-numerical) | Eye colour, favourite subject, type of transport |
| Quantitative | Data that can be measured or counted (numerical) | Height, number of siblings, temperature |
Quantitative data is further divided into two types:
| Type | Definition | Examples |
|---|---|---|
| Discrete | Data that can only take specific values (usually whole numbers from counting) | Number of pets, shoe size, dice score |
| Continuous | Data that can take any value within a range (usually from measuring) | Height (1.65 m), weight (72.3 kg), time (14.7 seconds) |
Exam Tip: A common exam question asks you to classify data. Remember — if you count it, it is discrete; if you measure it, it is continuous. Shoe size is a classic trick question: although it has half sizes (5.5, 6, 6.5), it is still discrete because it can only take specific values, not any value in a range.
Data can also be classified by how it was collected.
| Type | Definition | Advantages | Disadvantages |
|---|---|---|---|
| Primary data | Data you collect yourself for a specific purpose | Tailored to your needs; you know how it was collected | Time-consuming and expensive to collect |
| Secondary data | Data collected by someone else, often for a different purpose | Quick and cheap to obtain | May not exactly match your needs; may be out of date or biased |
Examples of primary data: surveys, experiments, questionnaires, observations.
Examples of secondary data: government statistics, newspaper reports, internet databases, school records.
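In statistics, a population is the whole group you want to find out about, and a sample is a smaller group selected from that population.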
We use samples because it is usually impractical (too expensive, too time-consuming) to survey an entire population.
A good sample should be representative of the population, free from bias, and large enough to give reliable results.
Exam Tip: If a question asks you to criticise a sampling method, check whether the sample is biased (certain groups are excluded or over-represented), too small, or unrepresentative of the population.
There are several methods for selecting a sample. You need to know the following five:
Simple random sampling: Every member of the population has an equal chance of being selected. Names or numbers are drawn at random (e.g. using a random number generator or pulling names from a hat).
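The random-number-generator approach can be sketched in a few lines of Python. This is only an illustration: the population of register numbers 1 to 800, the sample size of 80, and the seed are assumptions chosen for the example, not part of the lesson.

```python
import random

# Hypothetical population: register numbers 1 to 800 (an assumption for illustration).
population = list(range(1, 801))

# A fixed seed makes the run repeatable; drop it for a fresh draw each time.
rng = random.Random(42)

# sample() picks 80 different members, with every member equally likely to be chosen.
sample = rng.sample(population, k=80)

print(sorted(sample)[:5])  # first few selected register numbers
```

Because `sample()` draws without replacement, nobody can be picked twice, which matches pulling names from a hat.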
Systematic sampling: Members are selected at regular intervals from an ordered list (e.g. every 10th person on a register).
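Selecting at regular intervals might look like the sketch below. The helper `systematic_sample`, the register of 100 made-up students, and the random starting point are assumptions added for illustration; the lesson itself only specifies taking every 10th person.

```python
import random

def systematic_sample(ordered_list, interval, start):
    """Return every `interval`-th member, beginning at 0-based index `start`."""
    return ordered_list[start::interval]

# Hypothetical register of 100 students (an assumption for illustration).
register = [f"Student {n}" for n in range(1, 101)]

# Assumption: pick the starting point at random from the first 10 places.
start = random.randrange(10)
sample = systematic_sample(register, interval=10, start=start)

print(sample)
```

With 100 names and an interval of 10, the sample always contains 10 students, whichever starting point is chosen.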
Stratified sampling: The population is divided into groups (strata) based on a characteristic (e.g. age, gender, year group). A random sample is then taken from each group, in proportion to the size of that group in the population.
The number to sample from each stratum is calculated using:
Number from stratum = (number in stratum / total population) x sample size
A school has the following students:
| Year Group | Number of Students |
|---|---|
| Year 7 | 180 |
| Year 8 | 160 |
| Year 9 | 200 |
| Year 10 | 150 |
| Year 11 | 110 |
| Total | 800 |
A stratified sample of 80 students is needed.
Year 7: (180 / 800) x 80 = 18 students
Year 8: (160 / 800) x 80 = 16 students
Year 9: (200 / 800) x 80 = 20 students
Year 10: (150 / 800) x 80 = 15 students
Year 11: (110 / 800) x 80 = 11 students
Check: 18 + 16 + 20 + 15 + 11 = 80 (correct)
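The same calculation can be reproduced in Python as a quick check. The sketch below uses the year-group figures from the table; the variable names are illustrative, and `Fraction` is used only so the arithmetic stays exact.

```python
from fractions import Fraction

# Year-group sizes from the table above.
year_groups = {"Year 7": 180, "Year 8": 160, "Year 9": 200, "Year 10": 150, "Year 11": 110}
sample_size = 80
total = sum(year_groups.values())  # total population

# Number from stratum = (number in stratum / total population) x sample size
allocation = {
    name: Fraction(count, total) * sample_size
    for name, count in year_groups.items()
}

for name, number in allocation.items():
    print(f"{name}: {number} students")

# The allocations should add up to the required sample size.
print("Total:", sum(allocation.values()))
```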
Quota sampling: The researcher decides how many people from each group to include (sets a quota) and then selects people until each quota is filled. Unlike stratified sampling, the selection within each group is not random.
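To see why this selection is not random, consider the sketch below, which simply takes people in the order they are met until every quota is full. The function `quota_sample`, the names, and the quotas are all hypothetical and used only for illustration.

```python
def quota_sample(people, quotas):
    """Take people in the order encountered until every group's quota is filled."""
    remaining = dict(quotas)  # places still to fill in each group
    chosen = []
    for name, group in people:
        if remaining.get(group, 0) > 0:
            chosen.append(name)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is full, so stop asking
    return chosen

# Hypothetical people, listed in the order the researcher meets them.
passers_by = [
    ("Asha", "Year 7"), ("Ben", "Year 8"), ("Cal", "Year 7"),
    ("Dee", "Year 7"), ("Eli", "Year 8"),
]

print(quota_sample(passers_by, {"Year 7": 2, "Year 8": 1}))
# → ['Asha', 'Ben', 'Cal']
```

Dee and Eli never get a chance to be chosen because the quotas were already filled by people met earlier, which is how quota sampling can introduce bias.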
Convenience sampling: The researcher simply surveys whoever is easiest to reach or most readily available.
Exam Tip: In AQA exams, stratified sampling calculation questions are very common. Always show the fraction (stratum size / total population) multiplied by the sample size. Round to the nearest whole number if necessary, and always check that your values add up to the required sample size.
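When the fractions do not work out to whole numbers, the rounding and the final check can be sketched as follows. The sample size of 50 is a hypothetical value chosen so that rounding is needed, the helper `stratified_allocation` is illustrative, and rounding halves up is an assumption about the convention used.

```python
from fractions import Fraction
from math import floor

def stratified_allocation(strata, sample_size):
    """Apply (stratum size / total population) x sample size, rounding halves up."""
    total = sum(strata.values())
    exact = {name: Fraction(count, total) * sample_size for name, count in strata.items()}
    # floor(x + 1/2) rounds 12.5 up to 13; Python's built-in round() would give 12.
    return {name: floor(value + Fraction(1, 2)) for name, value in exact.items()}

# Year-group sizes from the worked example; 50 is a hypothetical sample size.
school = {"Year 7": 180, "Year 8": 160, "Year 9": 200, "Year 10": 150, "Year 11": 110}
rounded = stratified_allocation(school, sample_size=50)
total_selected = sum(rounded.values())

print(rounded)
print("Total:", total_selected)  # compare with the required sample size
```

Here the exact values are 11.25, 10, 12.5, 9.375 and 6.875, which round to 11, 10, 13, 9 and 7, giving a total of 50. With other numbers the rounded values may not add up to the required sample size, which is why the final check matters.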
Bias occurs when a sample does not fairly represent the population, leading to misleading results.
Common sources of bias include:
| Source of bias | Description |
|---|---|
| Selection bias | Certain groups are excluded from the sample |
| Question bias | Leading or confusing questions |
| Response bias | People lie or exaggerate their answers |
| Non-response bias | Some groups do not respond |
| Timing bias | Data is collected at an unrepresentative time |
Exam Tip: When asked to suggest improvements to a data collection method, always consider whether the sample is large enough, whether it is representative, and whether any groups have been excluded. Mentioning specific sources of bias will gain you marks.