Availability
100%
↑ No incidents
P99 Latency
142ms
↓ 18ms vs avg
Error Rate
0.03%
↑ Within SLO
ECS Tasks
6 / 12
3 AZs, scaling OK
CPU Avg
38%
↓ Below threshold
DB Connections
42 / 200
21% utilization
CPU Utilization (%), Last 30 min
Service Average
Request Rate (req/s), Last 30 min
ALB Target Group
P99 Latency (ms), Last 30 min
X-Ray Trace P99

CloudWatch Alarms

● OK  HighCPUUtilization     > 80% for 5 min     38%
● OK  HighMemoryUtilization  > 85% for 5 min     52%
● OK  ALB5xxErrorRate        > 1% for 2 min      0.03%
● OK  P99LatencyHigh         > 500ms for 3 min   142ms
● OK  RDSCPUHigh             > 75% for 5 min     24%
● OK  TaskCountLow           < 2 tasks           6 tasks
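
The alarm rows above follow a "threshold breached for N minutes" pattern. A minimal sketch of that evaluation logic is shown below, assuming one datapoint per minute and that every datapoint in the window must breach the threshold. The `AlarmRule` class and `evaluate` function are hypothetical names for illustration, not part of any AWS SDK, and real CloudWatch evaluation has more options (for example M-out-of-N datapoints and missing-data handling).

```python
import operator
from dataclasses import dataclass
from typing import Sequence

# Supported comparison operators (only the two that appear in the alarm list).
_OPS = {">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class AlarmRule:
    """Hypothetical in-memory description of one alarm row."""
    name: str
    comparison: str   # ">" or "<"
    threshold: float
    periods: int      # assumed: number of consecutive 1-minute datapoints


def evaluate(rule: AlarmRule, datapoints: Sequence[float]) -> str:
    """Return a CloudWatch-style state name for the most recent window."""
    if len(datapoints) < rule.periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-rule.periods:]
    breaches = _OPS[rule.comparison]
    # ALARM only when every datapoint in the window breaches the threshold.
    if all(breaches(value, rule.threshold) for value in recent):
        return "ALARM"
    return "OK"


# Two rules taken from the list above; the period counts are assumptions.
HIGH_CPU = AlarmRule("HighCPUUtilization", ">", 80.0, periods=5)
TASK_COUNT_LOW = AlarmRule("TaskCountLow", "<", 2, periods=1)
```

With the values shown on the dashboard, `evaluate(HIGH_CPU, [38.0] * 5)` returns `"OK"`, while five consecutive readings above 80 would return `"ALARM"`.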
📊

CloudWatch

Custom dashboards, metric alarms, log insights queries, and synthetic canaries.

🔍

AWS X-Ray

Distributed tracing across ECS tasks, RDS calls, and external API requests.

📋

CloudWatch Logs

Structured JSON logs aggregated into log groups per service with 90-day retention.
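
As an illustration of structured JSON logging, the sketch below uses only the Python standard library to emit one JSON object per log line. It assumes the service writes to stdout or stderr and that the container's log driver forwards those lines to the log group; the field names and the `JsonFormatter` and `make_logger` helpers are hypothetical choices, not a schema taken from this setup.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # ISO 8601 timestamp in UTC, derived from the record's creation time.
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(service: str, stream=None) -> logging.Logger:
    """Return a logger named after the service that emits JSON lines."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()                   # avoid duplicate handlers on re-init
    handler = logging.StreamHandler(stream)   # stream=None means sys.stderr
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

A call such as `make_logger("checkout-service").info("order placed")` (the service name is made up) writes a line with `timestamp`, `level`, `logger`, and `message` keys, which keeps each field individually queryable in a log query tool.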

📣

SNS Alerting

Alarm notifications routed to email, Slack webhook, and PagerDuty integration.

🕵

GuardDuty

Continuous threat detection for unusual API calls, network anomalies, and compromised credentials.

📑

CloudTrail

Full audit log of all AWS API calls for compliance and forensics.