Monitoring is the practice of collecting, analysing, and acting on data about your systems so that you can detect problems before they affect users, understand how your infrastructure behaves under load, and make informed decisions about capacity and cost. On AWS, monitoring is not an afterthought: most services emit operational data by default, and dedicated services exist to collect it and act on it.
Imagine you deploy a web application on a fleet of EC2 instances behind an Application Load Balancer. Traffic is light at first, but a marketing campaign drives a sudden spike. Without monitoring you might not notice that:
By the time customers complain, you have already lost revenue and trust. Monitoring closes the feedback loop between what your infrastructure is doing and what you think it is doing.
Google's Site Reliability Engineering book popularised four golden signals that apply just as well to AWS workloads:
| Signal | What It Measures | AWS Example |
|---|---|---|
| Latency | Time to serve a request | ALB target response time |
| Traffic | Demand on the system | Requests per second to API Gateway |
| Errors | Rate of failed requests | HTTP 5xx count on CloudFront |
| Saturation | How "full" a resource is | RDS CPU utilisation at 90% |
If you instrument these four signals for every workload, you will catch the vast majority of production issues.
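As a rough illustration of how the four signals relate to raw data, the sketch below summarises one observation window of request records. The `Request` type, the `golden_signals` function, and the choice of p95 latency and CPU percentage as the latency and saturation measures are all assumptions made for this example; they are not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float  # time taken to serve the request
    status: int        # HTTP status code returned to the client


def golden_signals(requests, window_seconds, cpu_percent):
    """Summarise one observation window into the four golden signals."""
    if not requests:
        return {
            "latency_p95_ms": None,
            "traffic_rps": 0.0,
            "error_rate": 0.0,
            "saturation_pct": cpu_percent,
        }

    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = (95 * len(latencies) + 99) // 100
    server_errors = sum(1 for r in requests if r.status >= 500)

    return {
        "latency_p95_ms": latencies[rank - 1],            # Latency
        "traffic_rps": len(requests) / window_seconds,    # Traffic
        "error_rate": server_errors / len(requests),      # Errors
        "saturation_pct": cpu_percent,                    # Saturation
    }


if __name__ == "__main__":
    sample = [Request(100.0, 200)] * 18 + [Request(900.0, 500), Request(1200.0, 503)]
    print(golden_signals(sample, window_seconds=60, cpu_percent=90.0))
```

In practice you would read these values from CloudWatch rather than compute them by hand; the point is only that each signal answers a different question about the same traffic.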
Monitoring on AWS is not a single tool — it is a spectrum of complementary capabilities:
Metrics are numeric time-series data points, such as CPU utilisation, request count, and queue depth. Amazon CloudWatch is the central metrics service on AWS, and most AWS services publish metrics to it automatically.
Logs are detailed textual records of events, such as application log lines, VPC Flow Logs, and Lambda invocation logs. CloudWatch Logs is the managed log aggregation and query service.
Traces follow end-to-end request paths through distributed systems. AWS X-Ray captures traces so you can see how a request flows from API Gateway through Lambda to DynamoDB and back.
Alerts are real-time notifications when something changes or crosses a threshold. CloudWatch Alarms can notify SNS topics, trigger Auto Scaling actions, or invoke Lambda functions when a metric breaches a limit.
Audit trails are a record of who did what and when. AWS CloudTrail records API activity in your account (management events by default, with data events available as an opt-in), giving you an audit trail for security and compliance.
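The threshold idea behind an alarm can be sketched in a few lines. This is a simplified model, assuming an alarm fires only when the most recent `evaluation_periods` datapoints all exceed the threshold. The function name and signature are hypothetical. The three state names match the states CloudWatch alarms report, but real CloudWatch evaluation has more options (for example, how missing data is treated) that this sketch ignores.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return a simplified alarm state for a series of metric datapoints.

    datapoints: metric values, oldest first, one per period.
    threshold: value the metric must exceed to count as breaching.
    evaluation_periods: how many of the latest datapoints must all breach.
    """
    if len(datapoints) < evaluation_periods:
        # Not enough history yet to make a decision.
        return "INSUFFICIENT_DATA"

    recent = datapoints[-evaluation_periods:]
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


if __name__ == "__main__":
    cpu = [50, 95, 96, 97]  # hypothetical CPU utilisation readings
    print(alarm_state(cpu, threshold=90, evaluation_periods=3))
```

Requiring several consecutive breaches rather than one is a common way to avoid alerting on a single noisy datapoint.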
| Service | Primary Purpose | Key Feature |
|---|---|---|
| Amazon CloudWatch | Metrics, logs, alarms, dashboards | Unified operational data from 70+ AWS services |
| AWS CloudTrail | API activity auditing | Records account API activity for governance and compliance |
| AWS X-Ray | Distributed tracing | Visualises request paths across microservices |
| Amazon EventBridge | Event-driven automation | Routes events from AWS services, SaaS, and custom apps |
| AWS Config | Resource configuration tracking | Continuous evaluation of resource compliance |
| VPC Flow Logs | Network traffic logging | Captures IP traffic metadata for analysis |
You do not need to master every service on day one. In this course we will focus on CloudWatch (metrics, logs, alarms, dashboards), CloudTrail, and X-Ray because they form the monitoring backbone of almost every AWS workload.
You will often hear the term "observability" alongside monitoring. The distinction is subtle but important:
Observability requires rich, high-cardinality data — detailed logs, distributed traces, and custom metrics. AWS provides the building blocks; your job is to instrument your applications to emit the right data.
When you combine all three pillars (metrics, logs, and traces) you can move from reactive firefighting to proactive, data-driven operations.
AWS publishes platform-level metrics for managed services automatically. For example, RDS exposes CPU utilisation, free storage space, and read/write IOPS without any configuration. However, AWS cannot see inside your application. You are responsible for:
Think of it as two layers:
| Layer | Responsibility | Example |
|---|---|---|
| Infrastructure | AWS provides built-in metrics | EC2 CPUUtilization, RDS FreeableMemory |
| Application | You instrument your code | Order processing time, payment success rate |
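One way to fill in the application layer is to emit custom metrics as structured log lines in the CloudWatch Embedded Metric Format (EMF), which CloudWatch can turn into metrics when the lines reach CloudWatch Logs. The sketch below only builds the JSON line. The helper name `emf_record`, the namespace `Shop/Checkout`, and the `Service` dimension are assumptions chosen for illustration, and how the line gets to CloudWatch Logs (for example, standard output in Lambda) is left out.

```python
import json
import time


def emf_record(namespace, service, metric_name, value, unit, timestamp_ms=None):
    """Build one Embedded Metric Format log line as a JSON string."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # EMF expects epoch milliseconds

    record = {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # Each inner list names the keys that form one dimension set.
                    "Dimensions": [["Service"]],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # Dimension and metric values live at the top level of the record.
        "Service": service,
        metric_name: value,
    }
    return json.dumps(record)


if __name__ == "__main__":
    # Hypothetical application metric: how long an order took to process.
    print(emf_record("Shop/Checkout", "checkout", "OrderProcessingTime", 123.0, "Milliseconds"))
```

Emitting metrics through logs keeps instrumentation close to the code path being measured and avoids a separate API call per datapoint, at the cost of log ingestion charges.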
Monitoring has a cost. CloudWatch charges per metric, per alarm, per GB of log data ingested, and per GB of log data scanned by queries. Before you instrument everything, plan a monitoring strategy that balances visibility against expense:
We will cover cost-effective monitoring patterns throughout this course.
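To make the trade-off concrete, a back-of-the-envelope estimate can help before you add instrumentation. The sketch below multiplies usage by unit prices for three of the billing dimensions mentioned above. The unit prices are placeholder values chosen so the arithmetic is easy to follow, not real AWS pricing, and the `UnitPrices` and `monthly_estimate` names are hypothetical. Check the current CloudWatch pricing page for your region before relying on any figure.

```python
from dataclasses import dataclass


@dataclass
class UnitPrices:
    # Placeholder prices for illustration only, NOT actual AWS pricing.
    per_custom_metric: float = 0.25
    per_alarm: float = 0.125
    per_log_gb_ingested: float = 0.5


def monthly_estimate(custom_metrics, alarms, log_gb_ingested, prices=None):
    """Estimate a monthly monitoring bill from usage and unit prices."""
    if prices is None:
        prices = UnitPrices()
    return (
        custom_metrics * prices.per_custom_metric
        + alarms * prices.per_alarm
        + log_gb_ingested * prices.per_log_gb_ingested
    )


if __name__ == "__main__":
    # Hypothetical workload: 40 custom metrics, 16 alarms, 100 GB of logs.
    print(monthly_estimate(custom_metrics=40, alarms=16, log_gb_ingested=100))
```

With these placeholder prices, log ingestion dominates the total, which is a useful reminder to look at log volume first when a monitoring bill grows.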
Monitoring is the foundation of reliable cloud operations. AWS provides a rich suite of services — CloudWatch for metrics, logs, and alarms; CloudTrail for API auditing; X-Ray for distributed tracing — that together give you deep visibility into your workloads. Understanding the four golden signals, the three pillars of observability, and the shared responsibility model for monitoring will prepare you for the hands-on lessons that follow.
In the next lesson we will dive into Amazon CloudWatch Metrics and Dashboards — the starting point for every AWS monitoring journey.