System Design: Monitoring Platform
Design a metrics collection and alerting system like Datadog.
The Problem
Design a metrics collection and alerting system for a company with 500 microservices. Each service emits 100 metrics at 10-second intervals. Users need dashboards and configurable alerts.
Scale Math
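The numbers in the problem statement are enough for a back-of-envelope estimate. The sketch below works it through in Python. The service count, metric count, and interval come from the problem; the 16 bytes per point is an assumption made here (an 8-byte timestamp plus an 8-byte float, ignoring tags, indexes, and compression), so treat the storage figure as a rough lower bound for raw data rather than a sizing recommendation.

```python
# Inputs taken from the problem statement.
SERVICES = 500
METRICS_PER_SERVICE = 100
INTERVAL_SECONDS = 10

# Assumption (not from the problem statement): one raw data point costs
# about 16 bytes (8-byte timestamp + 8-byte float value), before tags,
# indexes, replication, or compression.
BYTES_PER_POINT = 16

SECONDS_PER_DAY = 24 * 60 * 60

# Each (service, metric) pair is one time series.
active_series = SERVICES * METRICS_PER_SERVICE

# Every series reports once per interval.
points_per_second = active_series // INTERVAL_SECONDS
points_per_day = points_per_second * SECONDS_PER_DAY
raw_bytes_per_day = points_per_day * BYTES_PER_POINT

print(f"active series:     {active_series:,}")
print(f"points per second: {points_per_second:,}")
print(f"points per day:    {points_per_day:,}")
print(f"raw GB per day:    {raw_bytes_per_day / 1e9:.2f}")
```

That is 50,000 series, 5,000 points per second, and 432 million points per day, or roughly 6.9 GB of raw points per day under the 16-byte assumption.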
Architecture
Services → Agent (StatsD/OTel) → Kafka → Ingestion Workers
↓
Time-Series DB (InfluxDB / TimescaleDB)
↓
Query API → Dashboard UI
↓
Alert Evaluator → PagerDuty / Slack
Key Decisions
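One decision the diagram leaves open is how the Alert Evaluator turns stored points into a notification. A minimal sketch of one possible rule follows: fire only when every point in a trailing window exceeds a threshold, which avoids paging on a single spike. The `ThresholdRule` class, the `should_fire` function, and the metric name are hypothetical names chosen for illustration, not part of any real library, and in a real system the points would come from the Query API rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (unix timestamp in seconds, metric value)
Point = Tuple[float, float]


@dataclass
class ThresholdRule:
    """Hypothetical alert rule: value above `threshold` for the whole window."""
    metric: str
    threshold: float
    window_seconds: float


def should_fire(rule: ThresholdRule, points: List[Point], now: float) -> bool:
    """Return True if every point in the trailing window breaches the threshold."""
    window_start = now - rule.window_seconds
    recent = [value for ts, value in points if window_start <= ts <= now]
    # Assumption: an empty window does not fire. A production evaluator
    # would likely handle missing data with a separate "no data" alert.
    if not recent:
        return False
    return all(value > rule.threshold for value in recent)


rule = ThresholdRule(metric="api.latency_ms", threshold=500.0, window_seconds=30.0)

# Points arrive every 10 seconds, matching the problem's reporting interval.
sustained = [(60.0, 480.0), (70.0, 620.0), (80.0, 710.0), (90.0, 650.0), (100.0, 690.0)]
with_dip = [(60.0, 480.0), (70.0, 620.0), (80.0, 450.0), (90.0, 650.0), (100.0, 690.0)]

print(should_fire(rule, sustained, now=100.0))  # all four points in the window exceed 500
print(should_fire(rule, with_dip, now=100.0))   # the dip at t=80 blocks the alert
```

Requiring the whole window to breach trades slower detection (up to one window length) for fewer false pages; a shorter window or a "most of the window" rule shifts that balance the other way.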