System Design FAQ: Top Questions

17. How would you design a Logging and Monitoring System?

A Logging and Monitoring System collects, stores, queries, and visualizes logs and metrics to aid in debugging, observability, and real-time alerting for distributed applications.

📋 Functional Requirements

  • Collect structured and unstructured logs
  • Visualize metrics over time
  • Set up alerting thresholds and anomaly detection
  • Enable search across distributed systems

📦 Non-Functional Requirements

  • High write throughput and availability
  • Log retention and archiving
  • Secure and role-based access

🏗️ System Components

  • Log Shippers: Fluent Bit, Fluentd, Logstash, Vector
  • Metrics Exporters: Prometheus Node Exporter, custom collectors
  • Storage Layer: Loki or Elasticsearch for logs; InfluxDB or VictoriaMetrics for metrics
  • Dashboard UI: Grafana, Kibana

📂 Log Format (JSON)


{
  "timestamp": "2025-06-11T13:00:00Z",
  "level": "ERROR",
  "service": "billing-service",
  "message": "Payment failed",
  "user_id": "u5678"
}
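
A service could emit this format with Go's standard `encoding/json` package. The following is a minimal sketch, not a prescribed implementation: the `LogEntry` type and the `encodeEntry` helper are hypothetical names introduced here for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LogEntry mirrors the JSON log format shown above.
// The type name and field names are illustrative assumptions.
type LogEntry struct {
	Timestamp string `json:"timestamp"` // RFC 3339 timestamp in UTC
	Level     string `json:"level"`
	Service   string `json:"service"`
	Message   string `json:"message"`
	UserID    string `json:"user_id,omitempty"` // omitted when empty
}

// encodeEntry (hypothetical helper) renders one entry as a single JSON line,
// which is the shape a log shipper with a JSON parser expects.
func encodeEntry(e LogEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := encodeEntry(LogEntry{
		Timestamp: "2025-06-11T13:00:00Z",
		Level:     "ERROR",
		Service:   "billing-service",
		Message:   "Payment failed",
		UserID:    "u5678",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

Writing one JSON object per line keeps each record independently parseable, so the shipper does not need multi-line handling.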
        

📦 Prometheus Exporter (Go)


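// Counter labeled by HTTP method and path; the metric name is an assumption.
var requestCount = prometheus.NewCounterVec(prometheus.CounterOpts{Name: "http_requests_total", Help: "Requests handled."}, []string{"method", "path"})
prometheus.MustRegister(requestCount)                 // register once at startup
http.Handle("/metrics", promhttp.Handler())           // endpoint that Prometheus scrapes
requestCount.WithLabelValues("GET", "/api/pay").Inc() // call inside the request handler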
        

🔧 Fluent Bit Config Example


[INPUT]
  Name tail
  Path /var/log/app/*.log
  Tag app.logs
  Parser json

[OUTPUT]
  Name  es
  Match *
  Host  elasticsearch.local
  Port  9200
        

📈 Grafana Dashboard Setup

  • Data sources: Loki (logs), Prometheus (metrics)
  • Panels: Errors per second, request latency histogram
  • Alerts: If error count > 5 in 1 min, notify Slack
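
The alert rule above ("more than 5 errors in 1 minute") amounts to a sliding-window count. In practice the alerting engine would evaluate this, but the logic can be sketched in plain Go. `ErrorWindow`, `NewErrorWindow`, and `Observe` are hypothetical names, and timestamps are plain Unix seconds to keep the example small.

```go
package main

import "fmt"

// ErrorWindow counts error events inside a sliding time window.
type ErrorWindow struct {
	windowSec int64   // window length in seconds
	threshold int     // alert fires when the count exceeds this value
	events    []int64 // Unix timestamps of recent errors, oldest first
}

// NewErrorWindow builds a window; for the rule above use (60, 5).
func NewErrorWindow(windowSec int64, threshold int) *ErrorWindow {
	return &ErrorWindow{windowSec: windowSec, threshold: threshold}
}

// Observe records an error at time ts (Unix seconds) and reports whether the
// alert should fire. Timestamps are assumed to arrive in non-decreasing order.
func (w *ErrorWindow) Observe(ts int64) bool {
	w.events = append(w.events, ts)
	// Drop events that fall outside the window (ts-windowSec, ts].
	cutoff := ts - w.windowSec
	i := 0
	for i < len(w.events) && w.events[i] <= cutoff {
		i++
	}
	w.events = w.events[i:]
	return len(w.events) > w.threshold
}

func main() {
	w := NewErrorWindow(60, 5)
	// Six errors, five seconds apart: the sixth pushes the count above 5.
	for n := int64(0); n < 6; n++ {
		ts := n * 5
		if w.Observe(ts) {
			fmt.Printf("ALERT at t=%ds: more than 5 errors in the last minute\n", ts)
		}
	}
}
```

A real deployment would send the notification (for example to Slack) where this sketch prints, and would usually add a cooldown so a sustained error burst does not produce one message per event.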

🛡️ Access Control

  • Read/write permissions assigned through Grafana teams
  • Authenticate users via JWT or OAuth integration

📚 Retention Policy

  • Logs stay in hot, searchable storage for 7 days and are archived to S3 after 30 days
  • Cold storage remains queryable, but expect query latency above 5 s
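
A retention job has to decide, per log batch, which tier it belongs in. The sketch below encodes the 7-day and 30-day boundaries from the policy above. The helper name `tierForAge` is hypothetical, and the "warm" tier for logs between 7 and 30 days old is an assumption, since the policy does not say where logs live in that interval.

```go
package main

import "fmt"

const (
	hotDays     = 7  // searchable retention from the policy above
	archiveDays = 30 // age at which logs move to S3
)

// tierForAge (hypothetical helper) maps a log's age in days to a storage tier.
// The "warm" tier is an assumption; the policy above does not name it.
func tierForAge(ageDays int) string {
	switch {
	case ageDays < hotDays:
		return "hot"
	case ageDays < archiveDays:
		return "warm"
	default:
		return "archive"
	}
}

func main() {
	for _, d := range []int{0, 6, 7, 29, 30, 90} {
		fmt.Printf("age %2d days -> %s\n", d, tierForAge(d))
	}
}
```

Keeping the boundaries as named constants makes the policy easy to audit and change in one place.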

📌 Final Insight

Modern observability stacks store logs, metrics, and traces separately but unify them at the dashboard layer. Use structured logs for richer filtering, and add high-cardinality labels (such as user IDs) to metrics sparingly, because every distinct combination of label values becomes its own time series.
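
To see why label cardinality matters, a rough worst-case estimate multiplies the number of distinct values of each label. The sketch below uses a hypothetical helper, `maxSeries`, and made-up cardinalities chosen only for illustration.

```go
package main

import "fmt"

// maxSeries returns the worst-case number of time series for one metric:
// the product of the number of distinct values of each label.
func maxSeries(labelCardinality map[string]int) int {
	total := 1
	for _, n := range labelCardinality {
		total *= n
	}
	return total
}

func main() {
	// Illustrative cardinalities, not measurements.
	base := map[string]int{"method": 4, "path": 20, "status": 5}
	withUser := map[string]int{"method": 4, "path": 20, "status": 5, "user_id": 100000}

	fmt.Println("series without user_id:", maxSeries(base))
	fmt.Println("series with user_id:   ", maxSeries(withUser))
}
```

With these assumed numbers, one extra `user_id` label raises the bound from 400 series to 40,000,000, which is why per-user detail usually belongs in logs rather than in metric labels.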