Observability
Monitoring Dashboard
Real-time visibility across every layer — infrastructure, application, and business metrics.
Availability
100%
↑ No incidents
P99 Latency
142ms
↓ 18ms vs avg
Error Rate
0.03%
↑ Within SLO
ECS Tasks
6 / 12
3 AZs, scaling OK
CPU Avg
38%
↓ Below threshold
DB Connections
42 / 200
21% utilization
CPU Utilization (%), last 30 min: Service Average
Request Rate (req/s), last 30 min: ALB Target Group
P99 Latency (ms), last 30 min: X-Ray Trace P99
CloudWatch Alarms
Status  Alarm                  Condition           Current
OK      HighCPUUtilization     > 80% for 5 min     38%
OK      HighMemoryUtilization  > 85% for 5 min     52%
OK      ALB5xxErrorRate        > 1% for 2 min      0.03%
OK      P99LatencyHigh         > 500ms for 3 min   142ms
OK      RDSCPUHigh             > 75% for 5 min     24%
OK      TaskCountLow           < 2 tasks           6 tasks
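
Each alarm above pairs a threshold with a duration. As a rough sketch, a rule such as "HighCPUUtilization > 80% for 5 min" could be expressed as the keyword arguments accepted by the CloudWatch `PutMetricAlarm` API. The helper name `build_alarm_params`, the one-minute period, and the cluster and service names are assumptions made for illustration; the dashboard does not show how its alarms are actually defined.

```python
from typing import Any, Dict, Optional


def build_alarm_params(
    name: str,
    namespace: str,
    metric: str,
    threshold: float,
    minutes: int,
    comparison: str = "GreaterThanThreshold",
    statistic: str = "Average",
    dimensions: Optional[Dict[str, str]] = None,
    sns_topic_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build PutMetricAlarm keyword arguments (hypothetical helper)."""
    params: Dict[str, Any] = {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": metric,
        "Statistic": statistic,
        # Assumption: "for N min" means N consecutive 60-second periods.
        "Period": 60,
        "EvaluationPeriods": minutes,
        "Threshold": float(threshold),
        "ComparisonOperator": comparison,
        "Dimensions": [
            {"Name": key, "Value": value}
            for key, value in (dimensions or {}).items()
        ],
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


# Cluster and service names are placeholders, not taken from the dashboard.
high_cpu = build_alarm_params(
    name="HighCPUUtilization",
    namespace="AWS/ECS",
    metric="CPUUtilization",
    threshold=80,
    minutes=5,
    dimensions={"ClusterName": "example-cluster", "ServiceName": "example-service"},
)
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**high_cpu)`. A "less than" rule such as TaskCountLow would use `comparison="LessThanThreshold"` instead.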
CloudWatch
Custom dashboards, metric alarms, log insights queries, and synthetic canaries.
AWS X-Ray
Distributed tracing across ECS tasks, RDS calls, and external API requests.
CloudWatch Logs
Structured JSON logs aggregated into log groups per service with 90-day retention.
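
One way a service might emit structured JSON logs is with a custom formatter on Python's standard `logging` module. This is a minimal sketch: the field names, the `JsonFormatter` class, and the `checkout-api` service name are illustrative assumptions, not the schema used by the services shown here.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (hypothetical schema)."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback as a string when an exception is attached.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def make_logger(service: str) -> logging.Logger:
    """Return a logger that writes one JSON line per record to stderr."""
    logger = logging.getLogger(service)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(service))
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = make_logger("checkout-api")  # placeholder service name
logger.info("order placed, latency_ms=%d", 142)
```

One JSON object per line keeps entries queryable by field in Logs Insights. The 90-day retention would be set on the log group itself (for example through the CloudWatch Logs `PutRetentionPolicy` API) rather than in application code.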
SNS Alerting
Alarm notifications routed to email, Slack webhook, and PagerDuty integration.
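
As a sketch of the Slack leg of that routing, a Lambda function subscribed to the SNS topic could turn a CloudWatch alarm notification into a Slack incoming-webhook payload. The function name `slack_payload_from_sns`, the emoji mapping, and the sample event are assumptions for illustration; the sample is hand-written, not a captured notification, and the actual delivery path is not described on the dashboard.

```python
import json
from typing import Any, Dict


def slack_payload_from_sns(event: Dict[str, Any]) -> Dict[str, str]:
    """Convert an SNS-delivered CloudWatch alarm message into Slack JSON."""
    # SNS wraps the alarm as a JSON string in Records[0].Sns.Message.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    state = message.get("NewStateValue", "UNKNOWN")
    name = message.get("AlarmName", "unknown-alarm")
    reason = message.get("NewStateReason", "")
    icon = {"ALARM": ":red_circle:", "OK": ":large_green_circle:"}.get(
        state, ":white_circle:"
    )
    return {"text": f"{icon} {name} is {state}: {reason}"}


# Hand-written sample event for illustration only.
sample_event = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps(
                    {
                        "AlarmName": "P99LatencyHigh",
                        "NewStateValue": "ALARM",
                        "NewStateReason": "Threshold crossed",
                    }
                )
            }
        }
    ]
}

payload = slack_payload_from_sns(sample_event)
```

The resulting dictionary would then be POSTed as JSON to the Slack webhook URL, for instance with `urllib.request`; that step is left out so the sketch runs without network access.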
GuardDuty
Continuous threat detection for unusual API calls, network anomalies, and compromised credentials.
CloudTrail
Full audit log of all AWS API calls for compliance and forensics.