Google Cloud Monitoring is a fully managed service within Google Cloud's operations suite (formerly Stackdriver) that provides visibility into the performance, availability, and health of your cloud-powered applications and infrastructure. It collects metrics, events, and metadata from Google Cloud services, hosted uptime probes, and application instrumentation, giving you a unified view of your entire environment.
Running workloads in the cloud introduces a level of dynamism that traditional on-premises monitoring tools were not designed for. Resources scale up and down, services communicate across regions, and deployments happen continuously. Without effective monitoring you are operating blind — unable to detect degradation, understand capacity trends, or respond to incidents before they affect users.
| Benefit | Description |
|---|---|
| Proactive detection | Identify issues before users report them through alerting and anomaly detection |
| Root cause analysis | Correlate metrics, logs, and traces to pinpoint the source of problems quickly |
| Capacity planning | Use historical trends to forecast resource needs and avoid over-provisioning |
| Cost visibility | Track resource utilisation to identify waste and right-size workloads |
| Compliance | Maintain audit trails and demonstrate adherence to operational standards |
Without proper monitoring:
Google Cloud's operations suite provides an integrated set of monitoring, logging, and diagnostics tools:
| Service | Purpose |
|---|---|
| Cloud Monitoring | Collects metrics, creates dashboards, and triggers alerts |
| Cloud Logging | Ingests, stores, and analyses log data at scale |
| Cloud Trace | Distributed tracing for latency analysis across services |
| Cloud Profiler | Continuous CPU and memory profiling of production applications |
| Error Reporting | Aggregates and tracks application errors with stack traces |
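The Google Cloud operations suite maps each signal type to a service and its main tools:
| Signal | Service | Tools |
|---|---|---|
| Metrics | Cloud Monitoring | Dashboards, Alerts |
| Logs | Cloud Logging | Log Explorer, Log Router |
| Traces | Cloud Trace | Trace Explorer, Latency Analysis |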
Cloud Monitoring sits at the centre of the observability stack. It ingests platform metrics automatically from every Google Cloud resource and supports custom metrics from your applications.
Metrics are numerical measurements collected at regular intervals. Google Cloud automatically collects platform metrics for every resource — such as CPU utilisation for Compute Engine instances, request count for Cloud Run services, and query latency for BigQuery jobs.
| Metric Type | Description | Examples |
|---|---|---|
| Platform metrics | Automatically collected by Google Cloud services | compute.googleapis.com/instance/cpu/utilization |
| Custom metrics | Defined and sent by your application code | custom.googleapis.com/orders/processed |
| External metrics | Ingested from third-party systems via integrations | Datadog, Prometheus, or AWS CloudWatch metrics |
Every metric is associated with a monitored resource, which is the entity that the metric describes. Examples include gce_instance, cloud_run_revision, k8s_container (which replaces the legacy gke_container type), and cloudsql_database.
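To make the pairing of a metric with its monitored resource concrete, here is a minimal sketch in Python that builds the request body for writing one custom metric point. The field names (`metric`, `resource`, `points`, `interval.endTime`, `doubleValue`) follow the JSON form of the Monitoring API v3 `TimeSeries` object. The helper `build_time_series` and all label values are hypothetical placeholders, and the sketch only constructs the payload; it does not send it.

```python
import json
from datetime import datetime, timezone


def build_time_series(metric_type, resource_type, resource_labels, value, end_time):
    """Return one TimeSeries entry in the JSON shape used by the Monitoring API v3."""
    end_utc = end_time.astimezone(timezone.utc)
    return {
        # The metric says WHAT is measured.
        "metric": {"type": metric_type},
        # The monitored resource says WHERE it was measured.
        "resource": {"type": resource_type, "labels": dict(resource_labels)},
        "points": [
            {
                # A gauge point describes a single instant, so only endTime is set.
                "interval": {"endTime": end_utc.strftime("%Y-%m-%dT%H:%M:%SZ")},
                "value": {"doubleValue": float(value)},
            }
        ],
    }


series = build_time_series(
    metric_type="custom.googleapis.com/orders/processed",
    resource_type="gce_instance",
    # Placeholder label values; substitute those of a real VM.
    resource_labels={
        "project_id": "my-project",
        "instance_id": "1234567890",
        "zone": "us-central1-a",
    },
    value=42,
    end_time=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)

# Body for a timeSeries.create call (not sent here).
print(json.dumps({"timeSeries": [series]}, indent=2))
```

Keeping the metric and the resource as separate objects is what lets the same metric type be reported from many different VMs, containers, or services.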
Cloud Monitoring uses a metrics scope (formerly called a Workspace), hosted by a scoping project, to define which Google Cloud projects are monitored together. A single metrics scope can include up to 375 projects, giving you a unified view across your organisation.
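If an organisation has more projects than the 375-project limit mentioned above, they have to be spread over several scoping projects. The following is a rough planning sketch in Python; it treats the limit as a simple cap on list size, and the function name `plan_scopes` and the project IDs are hypothetical.

```python
MAX_PROJECTS_PER_SCOPE = 375  # limit quoted in the text above


def plan_scopes(project_ids, limit=MAX_PROJECTS_PER_SCOPE):
    """Split project IDs into consecutive groups of at most `limit` entries."""
    return [project_ids[i:i + limit] for i in range(0, len(project_ids), limit)]


# 800 made-up project IDs, purely for illustration.
projects = [f"project-{n}" for n in range(800)]
groups = plan_scopes(projects)
print([len(group) for group in groups])  # [375, 375, 50]
```

In practice the grouping would more likely follow team or environment boundaries than list order; the sketch only shows the arithmetic of the cap.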
Google Cloud metrics follow a hierarchical naming convention:
<service>.googleapis.com/<resource>/<metric_name>
For example:
| Metric | Description |
|---|---|
| compute.googleapis.com/instance/cpu/utilization | CPU utilisation of a Compute Engine VM |
| run.googleapis.com/request_count | Number of requests to a Cloud Run service |
| cloudsql.googleapis.com/database/cpu/utilization | CPU utilisation of a Cloud SQL instance |
| loadbalancing.googleapis.com/https/request_count | Request count for an HTTPS load balancer |
Understanding this naming convention is essential for building queries, dashboards, and alert policies.
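The naming convention can be illustrated with a small helper. This is a minimal sketch in Python using only the standard library; the function name `split_metric_type` is hypothetical and not part of any Google Cloud library. It separates the service domain from the metric path that follows it.

```python
def split_metric_type(metric_type: str) -> tuple[str, str]:
    """Split a metric type into (service domain, metric path)."""
    service, sep, path = metric_type.partition("/")
    if not sep or not path or not service.endswith(".googleapis.com"):
        raise ValueError(f"not a Google Cloud metric type: {metric_type!r}")
    return service, path


print(split_metric_type("compute.googleapis.com/instance/cpu/utilization"))
# ('compute.googleapis.com', 'instance/cpu/utilization')
print(split_metric_type("run.googleapis.com/request_count"))
# ('run.googleapis.com', 'request_count')
```

Note that the path after the service domain varies in depth: `run.googleapis.com/request_count` has a single segment, while the Compute Engine example has three.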
Cloud Monitoring is enabled by default for all Google Cloud projects. To verify or enable it explicitly:
gcloud services enable monitoring.googleapis.com --project=my-project
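# List available metric descriptors for Compute Engine CPU metrics
# (calls the Monitoring API v3 REST endpoint; my-project is a placeholder project ID)
curl -s -G \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  --data-urlencode 'filter=metric.type = starts_with("compute.googleapis.com/instance/cpu")' \
  "https://monitoring.googleapis.com/v3/projects/my-project/metricDescriptors"
# Read time-series data for a specific metric over a one-day window
curl -s -G \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  --data-urlencode 'filter=metric.type = "compute.googleapis.com/instance/cpu/utilization"' \
  --data-urlencode 'interval.startTime=2024-01-01T00:00:00Z' \
  --data-urlencode 'interval.endTime=2024-01-02T00:00:00Z' \
  "https://monitoring.googleapis.com/v3/projects/my-project/timeSeries"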
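The time-series read above combines a metric-type filter with a start and end time. As a minimal sketch, the same pieces can be assembled into a Monitoring API v3 `timeSeries.list` request URL in Python. The helper name `time_series_list_url` and the project ID are hypothetical; the query parameter names (`filter`, `interval.startTime`, `interval.endTime`) are those of the REST API. The sketch only builds the URL and does not perform the authenticated request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def time_series_list_url(project_id, metric_type, start_time, end_time):
    """Build a timeSeries.list URL for one metric type and time window."""
    query = urlencode({
        # Monitoring filter syntax: exact match on the metric type.
        "filter": f'metric.type = "{metric_type}"',
        # RFC 3339 timestamps bounding the window.
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    })
    return (
        f"https://monitoring.googleapis.com/v3/projects/{project_id}/timeSeries?{query}"
    )


url = time_series_list_url(
    "my-project",
    "compute.googleapis.com/instance/cpu/utilization",
    "2024-01-01T00:00:00Z",
    "2024-01-02T00:00:00Z",
)
print(url)

# Decoding the query string recovers the original filter expression.
print(parse_qs(urlsplit(url).query)["filter"][0])
```

Building the query with `urlencode` avoids hand-escaping the quotes and spaces that the filter syntax requires.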
A well-designed monitoring strategy on GCP covers multiple layers:
- The health of your Google Cloud resources
- Application-level behaviour
- Connectivity and traffic patterns
- Threats and configuration drift
Google Cloud Monitoring is the foundation of observability on GCP. It automatically collects platform metrics from every resource, supports custom and external metrics, and provides a unified view across multiple projects through scoping projects. Combined with Cloud Logging, Cloud Trace, Cloud Profiler, and Error Reporting, it forms a comprehensive operations suite for maintaining the health, performance, and security of your cloud workloads. In the next lesson, we will dive deep into metrics and dashboards.