CloudFront Monitoring — Metrics, Alarms & Error Rate Tracking

CloudFront is often the first layer of your stack that users touch — which makes it the most important layer to monitor. A misconfiguration at the CloudFront level can make your entire application unreachable even when your origin (ECS, ALB, S3) is perfectly healthy.

Despite this, CloudFront monitoring is frequently overlooked because “CDN problems” seem rare. In practice, CloudFront issues cause outages more often than people expect — typically after deployments, certificate changes, or cache policy updates.

CloudFront metrics in CloudWatch

CloudFront publishes its metrics to CloudWatch in us-east-1 (N. Virginia), regardless of where your distribution serves traffic. Switch the console to N. Virginia, or pass --region us-east-1 to the CLI, when viewing CloudFront metrics or creating alarms on them.
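A quick way to confirm which metrics CloudWatch actually holds for a distribution is to list them with the region pinned to us-east-1. This is a minimal sketch: the function name is hypothetical, and the distribution ID is the same placeholder used elsewhere in this article.

```shell
# Hypothetical helper: list the CloudFront metric names CloudWatch holds
# for one distribution. CloudFront metrics live in us-east-1, so the
# region is pinned here rather than taken from your CLI default.
list_cloudfront_metrics() {
  dist_id="$1"
  aws cloudwatch list-metrics \
    --namespace "AWS/CloudFront" \
    --dimensions "Name=DistributionId,Value=${dist_id}" \
    --region us-east-1 \
    --query 'Metrics[].MetricName' \
    --output text
}

# Usage (requires AWS credentials):
# list_cloudfront_metrics EDFDVBD6EXAMPLE
```

If the list comes back empty, check the region before suspecting the distribution.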

The core metrics to monitor:

5xxErrorRate — percentage of requests returning 5xx. This is the most critical signal. Anything above 1% warrants an alarm.
4xxErrorRate — high 4xx rates often indicate origin misconfigurations returning 403 or routing rules pointing to missing resources.
TotalErrorRate — combined 4xx + 5xx. Alarm at 5% for a first warning.
CacheHitRate — a sudden drop indicates caching stopped working, pushing all traffic to origin and increasing latency.
OriginLatency — how long CloudFront is waiting for your origin to respond. Spikes here often precede 5xx errors as origin health degrades.

# CloudFront metrics carry two dimensions: DistributionId and Region=Global.
# Without both, the alarm matches no data and stays in INSUFFICIENT_DATA.
aws cloudwatch put-metric-alarm \
  --alarm-name "cloudfront-5xx-error-rate-high" \
  --namespace "AWS/CloudFront" \
  --metric-name 5xxErrorRate \
  --dimensions Name=DistributionId,Value=EDFDVBD6EXAMPLE Name=Region,Value=Global \
  --statistic Average \
  --period 60 \
  --evaluation-periods 3 \
  --threshold 1 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:your-alerts-topic \
  --region us-east-1

Enable additional metrics

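By default, CloudFront only emits a subset of metrics, which includes 4xxErrorRate, 5xxErrorRate, and TotalErrorRate. To monitor CacheHitRate, OriginLatency, and error rates broken down by status code (401ErrorRate, 403ErrorRate, 404ErrorRate, 502ErrorRate, 503ErrorRate, 504ErrorRate), you need to enable additional metrics on each distribution. This carries an additional cost (~$10/month per distribution).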

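aws cloudfront create-monitoring-subscription \
  --distribution-id EDFDVBD6EXAMPLE \
  --monitoring-subscription 'RealtimeMetricsSubscriptionConfig={RealtimeMetricsSubscriptionStatus=Enabled}'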

# Or enable via console:
# CloudFront → Distributions → your dist → Monitoring tab → Enable additional metrics

For production distributions serving revenue traffic, this cost is trivial compared to the detection speed it provides.
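To check whether additional metrics are currently enabled on a distribution, you can read back its monitoring subscription. The sketch below assumes the placeholder distribution ID used in this article; the function name is hypothetical.

```shell
# Hypothetical helper: print the additional-metrics status for a distribution.
# "Enabled" is the value to look for.
additional_metrics_status() {
  dist_id="$1"
  aws cloudfront get-monitoring-subscription \
    --distribution-id "${dist_id}" \
    --query 'MonitoringSubscription.RealtimeMetricsSubscriptionConfig.RealtimeMetricsSubscriptionStatus' \
    --output text
}

# Usage (requires AWS credentials):
# additional_metrics_status EDFDVBD6EXAMPLE
```

Running this across all production distributions is a cheap way to audit the first item of the checklist at the end of this article.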

Origin health vs. CloudFront health

When a CloudFront alarm fires, the first question is: is this a CloudFront problem or an origin problem? The distinction matters because the remediation is different.

Origin returning errors — CloudFront's 5xxErrorRate rises but your ALB metrics show HTTPCode_Target_5XX_Count increasing simultaneously. Fix the origin.
CloudFront misconfiguration — CloudFront shows errors but ALB metrics are clean. Check your cache behaviors, origin settings, and SSL configuration.
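The two cases above can be sketched as a small triage helper. This is an illustration only: the function name and output labels are hypothetical, the inputs are values you have already read from CloudWatch, and treating any non-zero ALB 5xx count as "origin" is a simplifying assumption.

```shell
# Hypothetical triage helper. Arguments:
#   $1 = CloudFront 5xxErrorRate (percent) over the alarm window
#   $2 = ALB HTTPCode_Target_5XX_Count summed over the same window
# The 1% cut-off mirrors the alarm threshold used in this article.
classify_5xx_source() {
  cf_rate="$1"
  alb_5xx="$2"
  awk -v cf="$cf_rate" -v alb="$alb_5xx" 'BEGIN {
    if (cf <= 1)      print "no-alarm"
    else if (alb > 0) print "origin"
    else              print "cloudfront-or-connectivity"
  }'
}

# Usage:
# classify_5xx_source 4.2 37   # CloudFront and ALB both show errors
# classify_5xx_source 4.2 0    # CloudFront shows errors, ALB is clean
```

The second label is deliberately broad, because clean ALB metrics can mean either a CloudFront misconfiguration or requests that never reached the ALB at all.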

A CloudWatch dashboard that shows CloudFront and origin metrics side-by-side makes this diagnosis immediate rather than requiring manual cross-referencing.
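One way to build such a dashboard is to generate the body as JSON and pass it to put-dashboard. The sketch below is a minimal two-widget layout; the function name, dashboard name, ALB identifier, and ALB region are all placeholders chosen for illustration.

```shell
# Hypothetical helper: print a two-widget dashboard body as JSON.
#   $1 = CloudFront distribution ID
#   $2 = ALB dimension value (the "app/<name>/<id>" part of the load balancer ARN)
#   $3 = region the ALB runs in
dashboard_body() {
  cat <<EOF
{
  "widgets": [
    {
      "type": "metric", "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "CloudFront 5xxErrorRate",
        "region": "us-east-1",
        "stat": "Average",
        "period": 60,
        "metrics": [["AWS/CloudFront", "5xxErrorRate", "DistributionId", "$1", "Region", "Global"]]
      }
    },
    {
      "type": "metric", "x": 12, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "ALB HTTPCode_Target_5XX_Count",
        "region": "$3",
        "stat": "Sum",
        "period": 60,
        "metrics": [["AWS/ApplicationELB", "HTTPCode_Target_5XX_Count", "LoadBalancer", "$2"]]
      }
    }
  ]
}
EOF
}

# Usage (requires AWS credentials; all identifiers are placeholders):
# aws cloudwatch put-dashboard \
#   --dashboard-name cloudfront-vs-origin \
#   --dashboard-body "$(dashboard_body EDFDVBD6EXAMPLE app/my-alb/50dc6c495c0c9188 eu-west-1)" \
#   --region us-east-1
```

Each widget names its own region, which is what lets a single dashboard show the us-east-1 CloudFront metrics next to an ALB running elsewhere.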

Common CloudFront failure modes

Origin connection failures

If CloudFront can't reach your origin (security group blocks CloudFront IPs, ALB listener not configured, origin domain changed), it returns a 502 Bad Gateway, a 503, or a 504 Gateway Timeout. This looks like a 5xx error rate spike with clean ALB metrics, because the requests never reach the ALB.

To allow only CloudFront to reach your ALB, reference the AWS-managed prefix list com.amazonaws.global.cloudfront.origin-facing in the inbound rule of the ALB's security group, and verify the rule is still in place after any infrastructure change.
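A minimal sketch of that security group rule with the AWS CLI follows. It assumes the ALB listens on HTTPS (port 443); the function name, security group ID, and region are placeholders. The prefix list ID differs per region, so the sketch looks it up by name first.

```shell
# Hypothetical helper: allow HTTPS from CloudFront's origin-facing prefix list.
#   $1 = ALB security group ID
#   $2 = region the ALB runs in
allow_cloudfront_ingress() {
  sg_id="$1"
  region="$2"
  # Resolve the managed prefix list name to its ID in this region.
  pl_id=$(aws ec2 describe-managed-prefix-lists \
    --filters "Name=prefix-list-name,Values=com.amazonaws.global.cloudfront.origin-facing" \
    --query 'PrefixLists[0].PrefixListId' \
    --output text \
    --region "${region}")
  # Add an inbound rule that references the prefix list instead of CIDR ranges.
  aws ec2 authorize-security-group-ingress \
    --group-id "${sg_id}" \
    --ip-permissions "IpProtocol=tcp,FromPort=443,ToPort=443,PrefixListIds=[{PrefixListId=${pl_id}}]" \
    --region "${region}"
}

# Usage (requires AWS credentials; IDs are placeholders):
# allow_cloudfront_ingress sg-0123456789abcdef0 eu-west-1
```

Remember to remove any broader inbound rule (such as 0.0.0.0/0 on 443), otherwise the prefix list rule restricts nothing.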

Certificate issues

CloudFront requires that the ACM certificate in us-east-1 covers the alternate domain names (CNAMEs) configured on the distribution. If the certificate expires or its names don't cover an alias, CloudFront serves TLS errors to users. Monitor your ACM certificate's DaysToExpiry and your CloudFront distribution's domain configuration separately.
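An expiry alarm on the DaysToExpiry metric might look like the sketch below. The function name, alarm name, and ARNs are placeholders, and the daily period with a Minimum statistic is one reasonable choice rather than a requirement.

```shell
# Hypothetical helper: alarm when an ACM certificate is close to expiry.
#   $1 = certificate ARN (a us-east-1 certificate, as CloudFront requires)
#   $2 = SNS topic ARN for notifications
#   $3 = threshold in days (30 matches the checklist in this article)
create_cert_expiry_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name "acm-cert-expiring-soon" \
    --namespace "AWS/CertificateManager" \
    --metric-name DaysToExpiry \
    --dimensions "Name=CertificateArn,Value=$1" \
    --statistic Minimum \
    --period 86400 \
    --evaluation-periods 1 \
    --threshold "$3" \
    --comparison-operator LessThanOrEqualToThreshold \
    --alarm-actions "$2" \
    --region us-east-1
}

# Usage (requires AWS credentials; both ARNs are placeholders):
# create_cert_expiry_alarm \
#   arn:aws:acm:us-east-1:123456789012:certificate/your-certificate-id \
#   arn:aws:sns:us-east-1:123456789012:your-alerts-topic \
#   30
```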

Cache policy changes causing origin overload

Setting a cache TTL to 0 or changing the cache key to include query strings can cause the cache hit rate to plummet. If your origin isn't sized to handle 100% of traffic, this causes latency spikes and eventually 5xx errors. Watch CacheHitRate after every deployment.
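One way to watch CacheHitRate is a static alarm derived from your normal hit rate. The sketch below assumes you already know a baseline (for example 90%) and reads "drop" as a relative drop from that baseline; both function names are hypothetical. CacheHitRate is only available once additional metrics are enabled on the distribution.

```shell
# Hypothetical helper: turn a baseline hit rate and a relative drop into a threshold.
#   $1 = baseline CacheHitRate in percent
#   $2 = tolerated relative drop in percent
cache_hit_threshold() {
  awk -v base="$1" -v drop="$2" 'BEGIN { printf "%.1f\n", base * (1 - drop / 100) }'
}

# Hypothetical helper: alarm when CacheHitRate stays below the threshold for 5 minutes.
#   $1 = distribution ID, $2 = threshold in percent, $3 = SNS topic ARN
create_cache_hit_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name "cloudfront-cache-hit-rate-low" \
    --namespace "AWS/CloudFront" \
    --metric-name CacheHitRate \
    --dimensions "Name=DistributionId,Value=$1" "Name=Region,Value=Global" \
    --statistic Average \
    --period 60 \
    --evaluation-periods 5 \
    --threshold "$2" \
    --comparison-operator LessThanThreshold \
    --alarm-actions "$3" \
    --region us-east-1
}

# Usage (requires AWS credentials): a 90% baseline with a 20% relative drop gives 72.0.
# create_cache_hit_alarm EDFDVBD6EXAMPLE "$(cache_hit_threshold 90 20)" \
#   arn:aws:sns:us-east-1:123456789012:your-alerts-topic
```

A static threshold is the simplest option; if your hit rate varies a lot by time of day, the baseline needs to be chosen conservatively or the alarm will be noisy.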

External monitoring: the last line of defence

CloudWatch metrics are aggregated over each period as rates across all requests, so a 30-second outage that only affects a subset of requests may never push a rate over its alarm threshold. External monitoring makes a real HTTP/HTTPS request to your public CloudFront URL every 60 seconds, exactly as a user would, and fails immediately on any error.

This is particularly important for catching:

TLS errors that look like connection failures rather than HTTP responses.
A WAF rule blocking legitimate requests with a 403, which looks like ordinary 4xx traffic in aggregate metrics.
Redirect loops that return 301/302 indefinitely.
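The three cases above can be told apart with a plain curl probe. The sketch below is illustrative rather than a replacement for a monitoring service: the function names, the 5-hop redirect cap, and the 10-second timeout are assumptions, and only a few of curl's exit codes are mapped (35 and 60 for TLS failures, 47 for too many redirects).

```shell
# Hypothetical probe: request a URL the way a user would. $1 = URL to check.
probe_url() {
  url="$1"
  # --location follows redirects, capped at 5 hops so a loop fails fast.
  http_code=$(curl --silent --output /dev/null --location --max-redirs 5 \
    --max-time 10 --write-out '%{http_code}' "$url")
  curl_exit=$?
  interpret_probe "$curl_exit" "$http_code"
}

# Hypothetical helper: map curl's exit code and the final HTTP status to a verdict.
#   $1 = curl exit code, $2 = HTTP status code reported by curl
interpret_probe() {
  curl_exit="$1"
  http_code="$2"
  case "$curl_exit" in
    0)     ;;  # transfer completed; judge by HTTP status below
    35|60) echo "DOWN: TLS error (curl exit ${curl_exit})"; return 1 ;;
    47)    echo "DOWN: redirect loop"; return 1 ;;
    *)     echo "DOWN: connection failure (curl exit ${curl_exit})"; return 1 ;;
  esac
  case "$http_code" in
    2??) echo "UP: HTTP ${http_code}" ;;
    *)   echo "DOWN: HTTP ${http_code}"; return 1 ;;
  esac
}

# Usage (the domain is a placeholder):
# probe_url https://www.example.com/
```

Because the verdict comes from the final response after redirects, a 403 from a WAF rule is reported as down even though the TLS handshake and connection both succeeded.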

With PulseRadar, you add your CloudFront domain as a monitor and get a public status page that shows historical uptime, active incidents, and subscriber email notifications — all without any CloudWatch configuration.

CloudFront monitoring checklist

Enable additional metrics on all production distributions.
CloudWatch alarm on 5xxErrorRate > 1% in us-east-1.
CloudWatch alarm on TotalErrorRate > 5%.
CloudWatch alarm on CacheHitRate dropping more than 20% from baseline.
Dashboard comparing CloudFront and ALB origin metrics side-by-side.
ACM certificate expiry alarm at 30 days.
External uptime monitor hitting your CloudFront domain every 60 seconds.