What is Machine Learning

Machine learning (ML) is a branch of artificial intelligence that enables computers to learn from data and make decisions or predictions without being explicitly programmed for every scenario. Instead of writing rules by hand, you provide the machine with examples and let it discover the underlying patterns.


A Brief History

  • 1943 — Warren McCulloch and Walter Pitts create a mathematical model of a biological neuron
  • 1950 — Alan Turing publishes Computing Machinery and Intelligence and proposes the Turing Test
  • 1957 — Frank Rosenblatt introduces the Perceptron, one of the first trainable neural networks
  • 1959 — Arthur Samuel coins the term "machine learning" while working on a checkers-playing programme at IBM
  • 1967 — The nearest-neighbour algorithm is developed, enabling basic pattern recognition
  • 1979 — The Stanford Cart, a camera-equipped robot vehicle, crosses a chair-filled room without human intervention by using computer vision to avoid obstacles
  • 1997 — IBM Deep Blue defeats Garry Kasparov at chess
  • 2006 — Geoffrey Hinton and colleagues demonstrate effective training of deep neural networks (deep belief networks), helping to popularise the term "deep learning"
  • 2012 — AlexNet wins the ImageNet competition, sparking the deep learning revolution
  • 2014 — Generative Adversarial Networks (GANs) are introduced by Ian Goodfellow and colleagues
  • 2016 — Google DeepMind's AlphaGo defeats Lee Sedol, one of the world's strongest Go players, four games to one
  • 2017 — The Transformer architecture is published in Attention Is All You Need
  • 2020 — GPT-3 demonstrates remarkable language generation capabilities
  • Today — Machine learning is embedded in virtually every industry — from healthcare and finance to entertainment and transport

What is Machine Learning?

At its core, machine learning is about learning from data. A traditional programme follows explicit instructions written by a developer. A machine learning model, by contrast, is given data and a learning algorithm — the algorithm finds patterns in the data and encodes them into a model that can make predictions on new, unseen data.

Traditional Programming vs Machine Learning

Traditional Programming | Machine Learning
Input: Data + Rules | Input: Data + Expected Output
Output: Results | Output: Rules (a model)
Developer writes the logic | Algorithm discovers the logic
Works well for well-defined problems | Works well for complex, pattern-rich problems

A Simple Example

Suppose you want to classify emails as spam or not-spam:

  • Traditional approach: Write rules like "if the email contains 'free money', mark as spam"
  • ML approach: Give the algorithm thousands of labelled emails (spam / not-spam), and it learns the patterns that distinguish them
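
The contrast between the two approaches can be sketched in a few lines of plain Python. This is an illustration only: the six training messages are hypothetical, and the learner is a simplified word-score model (smoothed log-odds per word, with class priors left out because the toy classes are the same size), not a production spam filter.

```python
import math
from collections import Counter

# Hypothetical toy training data: (text, label) pairs, where label 1 means spam.
TRAIN = [
    ("free money now", 1),
    ("win free prize money", 1),
    ("claim your free prize", 1),
    ("meeting agenda for monday", 0),
    ("lunch on monday", 0),
    ("project meeting notes", 0),
]

def rule_based(text):
    """Traditional approach: a rule written by hand."""
    return 1 if "free money" in text else 0

def train_word_scores(examples):
    """ML approach: learn a log-odds score for each word from labelled examples."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, label in examples:
        (spam_counts if label == 1 else ham_counts).update(text.split())
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    scores = {}
    for word in vocab:
        # Add-one smoothing keeps every probability above zero
        p_spam = (spam_counts[word] + 1) / (spam_total + len(vocab))
        p_ham = (ham_counts[word] + 1) / (ham_total + len(vocab))
        scores[word] = math.log(p_spam / p_ham)
    return scores

def learned(text, scores):
    """Classify by summing the learned scores; unknown words contribute nothing."""
    total = sum(scores.get(word, 0.0) for word in text.split())
    return 1 if total > 0 else 0

scores = train_word_scores(TRAIN)
for message in ["win a prize", "monday meeting"]:
    print(message, "| rule:", rule_based(message), "| learned:", learned(message, scores))
```

The hand-written rule misses "win a prize" because it only checks for one phrase, while the learned scores flag it because "win" and "prize" appeared in the spam examples.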

Types of Machine Learning

Machine learning is broadly divided into three categories:

1. Supervised Learning

The algorithm learns from labelled data — each training example has an input and a known correct output. The goal is to learn a mapping from inputs to outputs.

Task | Description | Example
Classification | Predict a discrete category | Email spam detection, image recognition
Regression | Predict a continuous value | House price prediction, temperature forecasting
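
As a small illustration of a regression task, the sketch below fits a straight line to labelled examples using ordinary least squares. The floor areas and prices are made-up, noise-free numbers chosen so the result is easy to check by hand, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labelled data: floor area (square metres) -> price (thousands)
areas = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(areas, prices)

# Use the learned mapping on an input that was not in the training data
predicted_price = slope * 70 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction for 70: {predicted_price:.1f}")
```

The inputs (areas) and known outputs (prices) play the roles described above; the learned slope and intercept are the mapping from one to the other.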

2. Unsupervised Learning

The algorithm learns from unlabelled data — there are no correct answers provided. The goal is to discover hidden structure in the data.

Task | Description | Example
Clustering | Group similar data points together | Customer segmentation, document grouping
Dimensionality Reduction | Reduce the number of features while preserving structure | Data visualisation, noise removal
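
Clustering can be sketched with a bare-bones one-dimensional k-means. The spending figures and starting centroids below are hypothetical, and real projects would normally rely on a library implementation instead of this hand-rolled loop.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled data: monthly spend per customer
spend = [10, 12, 11, 50, 52, 51]

centroids, clusters = k_means_1d(spend, centroids=[10, 51])
print("centroids:", centroids)
print("clusters:", clusters)
```

No labels are supplied at any point; the two groups emerge purely from how close the values are to one another.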

3. Reinforcement Learning

The algorithm learns by interacting with an environment — it takes actions, receives rewards or penalties, and learns to maximise cumulative reward over time.

Concept | Description
Agent | The learner that takes actions
Environment | The world the agent interacts with
Reward | Feedback signal after each action
Policy | Strategy the agent learns to follow

Examples: game-playing AI (AlphaGo), robotics, autonomous driving, recommendation systems.
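
The agent, environment, reward and policy ideas can be made concrete with tabular Q-learning on a tiny corridor. Everything here is an assumption made for illustration: the five-cell corridor, the reward of 1.0 at the right-hand end, and the learning settings (`alpha`, `gamma`, `epsilon`) are hypothetical choices, not values taken from this lesson.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]

def step(state, action):
    """Environment: move one cell and return (next_state, reward, done)."""
    if action == "left":
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def choose_action(q, state, epsilon, rng):
    """Behaviour during training: epsilon-greedy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn action values from repeated interaction."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap the episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            future = 0.0 if done else gamma * max(q[next_state].values())
            # Nudge the estimate towards reward plus discounted future value
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Policy: in each non-goal cell, pick the action with the highest learned value
policy = [max(q_table[s], key=q_table[s].get) for s in range(N_STATES - 1)]
print(policy)
```

The discount factor `gamma` is what makes the agent prefer reaching the goal sooner, which is one way of expressing "maximise cumulative reward over time".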


Key Terminology

Term | Definition
Feature | An individual measurable property of the data (e.g., age, height, income)
Label / Target | The variable you want to predict (e.g., price, category)
Training Set | Data used to train the model
Test Set | Data held out to evaluate the model's performance on unseen data
Model | The mathematical representation learned from data
Hyperparameter | A setting configured before training (e.g., learning rate, number of trees)
Overfitting | Model memorises training data and performs poorly on new data
Underfitting | Model is too simple to capture the patterns in the data
Generalisation | The model's ability to perform well on new, unseen data
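
Several of these terms can be seen together in a deliberately contrived sketch. The data is hypothetical (one feature `x`, label 1 when `x` is at least 5), and the three "models" are hand-written stand-ins chosen to exaggerate underfitting, overfitting and generalisation.

```python
# Hypothetical toy data: (feature, label) pairs; the label is 1 when x >= 5
train_set = [(0, 0), (2, 0), (4, 0), (6, 1), (8, 1), (10, 1)]
test_set = [(1, 0), (3, 0), (5, 1), (7, 1), (9, 1)]

def accuracy(model, data):
    """Fraction of examples where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in data) / len(data)

def fit_constant(data):
    """Underfits: ignores the feature and always predicts class 0."""
    return lambda x: 0

def fit_memoriser(data):
    """Overfits: stores every training pair and guesses 0 for anything unseen."""
    table = dict(data)
    return lambda x: table.get(x, 0)

def fit_threshold(data):
    """Generalises: learns one cut-off halfway between the two classes."""
    highest_negative = max(x for x, y in data if y == 0)
    lowest_positive = min(x for x, y in data if y == 1)
    threshold = (highest_negative + lowest_positive) / 2
    return lambda x: 1 if x >= threshold else 0

results = {}
for name, fit in [("constant", fit_constant),
                  ("memoriser", fit_memoriser),
                  ("threshold", fit_threshold)]:
    model = fit(train_set)
    results[name] = (accuracy(model, train_set), accuracy(model, test_set))
    print(f"{name}: train={results[name][0]:.2f}, test={results[name][1]:.2f}")
```

The memoriser scores perfectly on the training set but no better than the constant model on the test set, which is why performance is measured on held-out data.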

The Machine Learning Pipeline

Most machine learning projects follow a similar workflow, and in practice the later steps often send you back to earlier ones:

  1. Define the problem — What are you trying to predict or discover?
  2. Collect data — Gather relevant data from databases, APIs, files, or web scraping
  3. Prepare data — Clean, transform, and engineer features
  4. Choose a model — Select an appropriate algorithm for the task
  5. Train the model — Fit the model to the training data
  6. Evaluate the model — Measure performance on the test set
  7. Tune and improve — Adjust hyperparameters, add features, try different algorithms
  8. Deploy — Put the model into production to make predictions on new data
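
Steps 3 to 7 of the workflow above can be sketched with scikit-learn. This is one possible arrangement, not a prescribed recipe: the choice of `StandardScaler`, `LogisticRegression` and the candidate values for `C` are assumptions made for the example, and deployment (step 8) is left out.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: the problem is species classification; the data is a built-in dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Steps 3-4: data preparation (scaling) and the chosen model in one pipeline
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Steps 5 and 7: fit the pipeline while tuning one hyperparameter by cross-validation
search = GridSearchCV(pipeline, param_grid={"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Step 6: evaluate once on the held-out test set
test_accuracy = search.score(X_test, y_test)
print(f"Best C: {search.best_params_['clf__C']}")
print(f"Test accuracy: {test_accuracy:.2f}")
```

Keeping the scaler inside the pipeline means it is fitted only on the training folds during cross-validation, so no information from the validation or test data leaks into preparation.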

Python Libraries for Machine Learning

Library | Purpose
NumPy | Numerical computing and array operations
Pandas | Data manipulation and analysis
Matplotlib / Seaborn | Data visualisation
Scikit-Learn | Classical machine learning algorithms, preprocessing, evaluation
TensorFlow | Deep learning framework (Google)
PyTorch | Deep learning framework (Meta)
XGBoost / LightGBM | Gradient boosting libraries for tabular data

A First ML Example in Python

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load a built-in dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create and train a model
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Make predictions and evaluate
predictions = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, predictions):.2f}")
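
Once a model like the one above is trained, it can be asked about a flower it has never seen. The sketch below retrains the same decision tree so that it runs on its own; the measurements in `new_flower` are a hypothetical example, not data from the lesson.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Rebuild the model from the example above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Hypothetical new measurement, in the dataset's feature order:
# sepal length, sepal width, petal length, petal width (all in cm)
new_flower = [[5.1, 3.5, 1.4, 0.2]]

# predict() returns a class index; target_names maps it back to a species name
predicted_index = model.predict(new_flower)[0]
predicted_name = iris.target_names[predicted_index]
print(f"Predicted species: {predicted_name}")
```

Note that `predict` expects a two-dimensional input (a list of samples), which is why the single flower is wrapped in an outer list.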

Real-World Applications

Healthcare

  • Disease diagnosis from medical images
  • Drug discovery and molecular property prediction
  • Patient readmission risk prediction

Finance

  • Credit scoring and loan default prediction
  • Fraud detection in transactions
  • Algorithmic trading strategies

Technology

  • Recommendation engines (Netflix, Spotify, Amazon)
  • Natural language processing (chatbots, translation)
  • Computer vision (facial recognition, autonomous vehicles)

Science

  • Protein structure prediction (AlphaFold)
  • Climate modelling and weather forecasting
  • Particle physics and astronomical surveys

When to Use Machine Learning

Machine learning is not always the right tool. Consider ML when:

  • The problem involves complex patterns that are hard to define with rules
  • You have sufficient labelled or unlabelled data
  • The pattern you want to learn is relatively stable over time
  • A small improvement in accuracy has significant value

Avoid ML when:

  • A simple rule-based system would suffice
  • You do not have enough data
  • The problem requires every decision to be fully explainable (some models are "black boxes")
  • The cost of a single error is extremely high and there is no margin for mistakes

Summary

Machine learning enables computers to learn from data rather than following explicit instructions. It is broadly divided into supervised learning (learning from labelled data), unsupervised learning (discovering hidden structure), and reinforcement learning (learning through interaction). The ML pipeline involves defining a problem, collecting and preparing data, training and evaluating a model, and deploying it to production. Python and its rich ecosystem — Scikit-Learn, TensorFlow, PyTorch — make machine learning accessible to beginners and experts alike.