What is Statistics
Statistics is the science of collecting, organising, analysing, interpreting, and presenting data. It provides the tools to transform raw observations into meaningful conclusions, enabling evidence-based decision-making in virtually every field — from medicine and engineering to business and social policy.
A Brief History
- ~3000 BC — Ancient civilisations in Egypt, Babylon, and China conduct census-like counts for taxation and military planning
- 1662 — John Graunt publishes Natural and Political Observations Made upon the Bills of Mortality, founding modern demography
- 1713 — Jacob Bernoulli publishes Ars Conjectandi, formalising the law of large numbers
- 1805 — Adrien-Marie Legendre introduces the method of least squares for curve fitting
- 1812 — Pierre-Simon Laplace publishes Théorie analytique des probabilités
- 1900 — Karl Pearson develops the chi-squared test
- 1908 — William Sealy Gosset, writing under the pen name "Student", publishes "The Probable Error of a Mean", introducing the t-distribution
- 1925 — Ronald Fisher publishes Statistical Methods for Research Workers, shaping modern experimental design
- 1933 — Jerzy Neyman and Egon Pearson formalise hypothesis testing with Type I and Type II errors
- 1979 — Bradley Efron introduces the bootstrap, as growing computer power makes computational statistics practical
- Today — Statistics underpins machine learning, data science, clinical trials, and public policy worldwide
Why Learn Statistics?
1. Data-Driven Decision Making
Statistics turns raw numbers into actionable insights. Whether you are evaluating a new medical treatment, optimising a marketing campaign, or assessing economic policy, statistical reasoning provides the framework for sound decisions.
2. Critical Thinking
Understanding statistics makes you a better consumer of information. You learn to question sample sizes, identify bias, distinguish correlation from causation, and spot misleading charts.
3. Foundation for Data Science and Machine Learning
Modern AI and machine learning algorithms are built on statistical principles — linear regression, Bayesian inference, probability distributions, and hypothesis testing are all core topics.
4. Universal Applicability
Statistics is applied in almost every field, for example:
| Field | Example Application |
|---|---|
| Medicine | Clinical trials and drug efficacy testing |
| Business | Market research, A/B testing, demand forecasting |
| Engineering | Quality control and reliability analysis |
| Social sciences | Survey analysis, opinion polling |
| Sports | Player performance analytics (sabermetrics) |
| Government | Census data, economic indicators |
Branches of Statistics
Statistics is broadly divided into two major branches:
Descriptive Statistics
Descriptive statistics summarise and organise data so it can be understood at a glance. Common tools include:
- Measures of central tendency (mean, median, mode)
- Measures of spread (range, variance, standard deviation)
- Visual displays (histograms, box plots, bar charts)
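As a rough illustration of the numerical measures listed above, the following sketch uses Python's standard `statistics` module. The `scores` list is invented purely for this example, and the variance and standard deviation shown are the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical exam scores, invented purely for illustration.
scores = [52, 67, 67, 71, 74, 78, 81, 85, 90, 95]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of spread
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.2f}, std_dev={std_dev:.2f}")
```

If the data were a complete population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the corresponding choices.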
Inferential Statistics
Inferential statistics use sample data to make generalisations about a larger population. Key techniques include:
- Estimation (confidence intervals)
- Hypothesis testing (t-tests, chi-squared tests)
- Regression and prediction
Population → Sample → Analyse → Infer back to Population
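The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the population is simulated (hypothetical heights in cm), the sample size of 100 is arbitrary, and the interval uses a normal approximation rather than the slightly wider t-based interval.

```python
import random
import statistics
from statistics import NormalDist

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated heights in cm (invented values).
population = [random.gauss(170, 10) for _ in range(10_000)]

# Sample: a simple random sample drawn without replacement.
sample = random.sample(population, 100)

# Analyse: point estimate and standard error of the mean.
sample_mean = statistics.mean(sample)
std_error = statistics.stdev(sample) / len(sample) ** 0.5

# Infer: approximate 95% confidence interval for the population mean,
# using the normal critical value (about 1.96).
z = NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * std_error
ci_high = sample_mean + z * std_error

print(f"sample mean = {sample_mean:.1f} cm")
print(f"approx. 95% CI = ({ci_low:.1f}, {ci_high:.1f}) cm")
```

The interval describes uncertainty about the population mean, not the spread of individual heights. A larger sample shrinks the standard error and therefore narrows the interval.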
Key Terminology
| Term | Definition |
|---|---|
| Population | The complete set of all items of interest |
| Sample | A subset of the population selected for analysis |
| Parameter | A numerical measure describing a characteristic of a population (e.g., population mean μ) |
| Statistic | A numerical measure describing a characteristic of a sample (e.g., sample mean x̄) |
| Variable | A characteristic or attribute that can take different values |
| Data | The values collected through observation or measurement |
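To make the distinction between a parameter and a statistic concrete, here is a small sketch. The club ages are hypothetical values chosen for illustration, and the seeded generator `rng` is only there so that repeated runs pick the same sample.

```python
import random
import statistics

# Hypothetical population: ages of all 12 members of a small club (invented values).
population = [23, 25, 31, 34, 36, 40, 41, 47, 52, 55, 58, 62]

# Parameter: computed from every member of the population.
mu = statistics.mean(population)

# Sample: a random subset of 5 members.
rng = random.Random(0)  # seeded generator for reproducibility
sample = rng.sample(population, 5)

# Statistic: the same measure, computed from the sample only.
x_bar = statistics.mean(sample)

print(f"population mean (parameter) = {mu}")
print(f"sample mean (statistic)     = {x_bar}")
```

The parameter is a single fixed value, whereas the statistic changes whenever a different sample is drawn. In practice the population is usually too large to measure in full, so the statistic serves as an estimate of the parameter.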
Types of Data
By Nature
| Type | Description | Examples |
|---|---|---|
| Quantitative | Numerical values that can be measured or counted | Height, weight, temperature, income |
| Qualitative (Categorical) | Labels or categories | Gender, colour, nationality, satisfaction rating |
By Measurement Scale
| Scale | Properties | Examples |
|---|---|---|
| Nominal | Categories with no natural order | Blood type (A, B, AB, O), eye colour |
| Ordinal | Categories with a meaningful order, but the gaps between them are not known to be equal | Survey ratings (poor, fair, good, excellent) |
| Interval | Numerical with equal intervals but no true zero | Temperature in °C, calendar years |
| Ratio | Numerical with equal intervals and a true zero | Weight, height, income, age |
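The measurement scale determines which calculations are meaningful. The sketch below uses invented values to illustrate this: ratios of Celsius temperatures are misleading because 0 °C is not a true zero, whereas ratios of weights are valid, and ordinal data supports a median category but not an arithmetic mean. The helper `celsius_to_kelvin` and all variable names are illustrative only.

```python
# Interval scale: differences are meaningful, ratios are not.
temp_a_c, temp_b_c = 10.0, 20.0
difference_c = temp_b_c - temp_a_c   # a meaningful 10-degree difference
naive_ratio = temp_b_c / temp_a_c    # 2.0, but 20 °C is not "twice as hot"

def celsius_to_kelvin(c):
    """Convert to kelvin, a ratio scale with a true zero."""
    return c + 273.15

# On the kelvin scale the ratio is only about 1.035.
kelvin_ratio = celsius_to_kelvin(temp_b_c) / celsius_to_kelvin(temp_a_c)

# Ratio scale: both differences and ratios are meaningful.
weight_a_kg, weight_b_kg = 40.0, 80.0
weight_ratio = weight_b_kg / weight_a_kg  # genuinely twice as heavy

# Ordinal scale: order is meaningful, so a median category is well defined.
levels = ["poor", "fair", "good", "excellent"]
ratings = ["good", "poor", "excellent", "good", "fair"]
ranked = sorted(ratings, key=levels.index)
median_rating = ranked[len(ranked) // 2]

print(f"naive Celsius ratio = {naive_ratio}, kelvin ratio = {kelvin_ratio:.3f}")
print(f"weight ratio = {weight_ratio}, median rating = {median_rating}")
```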
The Statistical Process
A typical statistical investigation follows these steps:
1. Define the question — What do you want to learn?
2. Design the study — How will you collect data? (Experiment vs. observational study)
3. Collect data — Gather observations using surveys, experiments, or existing records
4. Explore and describe — Summarise data with descriptive statistics and visualisations
5. Analyse and infer — Apply inferential methods to draw conclusions
6. Interpret and communicate — Report findings clearly, noting limitations and uncertainties
Common Pitfalls
Warning: Statistics can be misused — intentionally or accidentally. Watch out for these:
- Selection bias — The sample does not represent the population
- Confounding variables — A third variable, often unmeasured, influences both the explanatory and response variables, creating an apparent association between them
- Correlation ≠ Causation — Two variables moving together does not mean one causes the other
- Misleading graphs — Truncated axes, cherry-picked scales, or 3D effects that distort perception
- Small sample sizes — Drawing broad conclusions from too few observations
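Confounding and the correlation-versus-causation pitfall can be demonstrated with a small simulation. Everything here is hypothetical: `temperature` plays the role of the confounder, and the two outcome variables are generated so that neither has any effect on the other. The `pearson_r` helper is written out by hand so the sketch relies only on basic Python.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # seeded generator for reproducibility

# Hypothetical confounder, e.g. daily temperature in standardised units.
temperature = [rng.gauss(0, 1) for _ in range(1000)]

# Both outcomes depend on temperature plus independent noise;
# neither depends on the other.
ice_cream_sales = [t + rng.gauss(0, 0.5) for t in temperature]
sunburn_cases = [t + rng.gauss(0, 0.5) for t in temperature]

# Strong correlation despite there being no causal link between the two.
r = pearson_r(ice_cream_sales, sunburn_cases)

# Subtracting the confounder's contribution leaves only independent noise.
sales_adj = [s - t for s, t in zip(ice_cream_sales, temperature)]
sunburn_adj = [c - t for c, t in zip(sunburn_cases, temperature)]
r_adjusted = pearson_r(sales_adj, sunburn_adj)

print(f"raw correlation:      {r:.2f}")
print(f"after removing temp.: {r_adjusted:.2f}")
```

In real data the confounder's effect is rarely known exactly, so it cannot simply be subtracted. Randomised experiments and regression adjustment are the usual ways to deal with it.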
Summary
Statistics is the science of learning from data. It comprises descriptive methods (summarising data) and inferential methods (drawing conclusions about populations from samples). Understanding key terminology — population, sample, parameter, statistic — and the different types of data is essential groundwork for every topic that follows in this course.