
System Design FAQ: Top Questions

13. How would you design a Real-Time Analytics System?

A Real-Time Analytics System collects, processes, aggregates, and displays metrics and logs within seconds of the data being generated. This matters for use cases such as live dashboards, alerting, and fraud detection.

Functional Requirements

  • Ingest event data from multiple sources (web, mobile, backend)
  • Stream processing and aggregation
  • Queryable analytics dashboard

Non-Functional Requirements

  • Sub-second or near real-time latency
  • Horizontal scalability and backpressure handling
  • Durable, fault-tolerant data pipeline
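
The backpressure requirement above can be illustrated with a bounded buffer that refuses new events when it is full, so the caller has to slow down, retry, or shed load instead of exhausting memory. This is only a minimal in-process sketch: the `BoundedBuffer` class and its method names are hypothetical, and a real pipeline would normally rely on the broker and stream processor for this behavior.

```python
import queue


class BoundedBuffer:
    """Hypothetical bounded buffer that signals backpressure when full."""

    def __init__(self, capacity: int) -> None:
        self._q: queue.Queue = queue.Queue(maxsize=capacity)
        self.rejected = 0  # number of events refused because the buffer was full

    def offer(self, event: dict) -> bool:
        """Try to enqueue an event; return False (backpressure) when full."""
        try:
            self._q.put_nowait(event)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def drain(self, max_items: int) -> list:
        """Remove up to max_items events for downstream processing."""
        batch = []
        while len(batch) < max_items:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        return batch
```

A producer that receives `False` from `offer` could back off and retry, which is one simple way to propagate pressure upstream.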

๐Ÿ—๏ธ Architecture Components

  • Producers: Client SDKs and backend services
  • Ingestion: Kafka, Amazon Kinesis, or Pub/Sub
  • Stream Processing: Apache Flink, Spark Streaming, Kafka Streams
  • Data Store: Druid, ClickHouse, or BigQuery for OLAP-style queries
  • Dashboard: Superset, Grafana, or custom UI

Example Event Schema (JSON)


{
  "event_type": "page_view",
  "timestamp": "2025-06-11T12:00:00Z",
  "user_id": "u1234",
  "page": "/pricing",
  "device": "mobile",
  "country": "US"
}
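
Events like the one above are usually validated before they enter the pipeline. The sketch below parses such a JSON event in Python. Treating `event_type`, `timestamp`, and `user_id` as required and everything else as free-form attributes is an assumption made for illustration, not something the schema itself states; `Event` and `parse_event` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumption: these three fields are mandatory; the rest are optional attributes.
REQUIRED_FIELDS = ("event_type", "timestamp", "user_id")


@dataclass(frozen=True)
class Event:
    event_type: str
    timestamp: datetime  # always normalized to UTC
    user_id: str
    attributes: dict


def parse_event(raw: str) -> Event:
    """Parse and validate one JSON-encoded event."""
    data = json.loads(raw)
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    ts = datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00"))
    if ts.tzinfo is None:
        raise ValueError("timestamp must include a UTC offset")
    attributes = {k: v for k, v in data.items() if k not in REQUIRED_FIELDS}
    return Event(data["event_type"], ts.astimezone(timezone.utc),
                 data["user_id"], attributes)
```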
        

Kafka Topic Config (Example)


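# Topic: user-events (create with 12 partitions, e.g. --partitions 12;
# the partition count is not a topic-level config key)
# Use delete rather than compact: compaction keeps only the latest record per key
cleanup.policy=delete
# Retain events for 24 hours
retention.ms=86400000
compression.type=snappy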
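
To show why the partition count matters, the sketch below maps an event key to one of the topic's 12 partitions, so all events for the same user land on the same partition and keep their relative order. This is an illustration only: Kafka's default partitioner uses a murmur2 hash of the key bytes, whereas this sketch substitutes MD5 from the standard library. Keying by `user_id` and the `partition_for` helper are assumptions.

```python
import hashlib

NUM_PARTITIONS = 12  # matches the example topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash.

    Stand-in for Kafka's murmur2-based default partitioner; the partition
    numbers produced here will not match what a real Kafka client computes.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because the mapping depends on the partition count, changing the number of partitions later changes where keys land, which is one reason to size the topic up front.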
        

Flink SQL Query Example


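-- Counts page views per 1-minute tumbling window.
-- Assumes page_events.event_time is declared as a time attribute (with a watermark).
SELECT
  TUMBLE_START(event_time, INTERVAL '1' MINUTE) AS window_start,
  COUNT(*) AS page_views
FROM page_events
GROUP BY TUMBLE(event_time, INTERVAL '1' MINUTE);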
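
For readers less familiar with windowing, the same one-minute tumbling-window count can be sketched in plain Python. This is a batch analogue for intuition only: it ignores watermarks, late events, and state management, all of which a stream processor handles. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 60  # one-minute tumbling windows


def window_start(ts: datetime, size_seconds: int = WINDOW_SECONDS) -> datetime:
    """Round a timezone-aware timestamp down to the start of its window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % size_seconds, tz=timezone.utc)


def count_per_window(timestamps) -> dict:
    """Count events per window, ordered by window start."""
    counts = Counter(window_start(ts) for ts in timestamps)
    return dict(sorted(counts.items()))
```

Each event belongs to exactly one window, which is what distinguishes tumbling windows from sliding or session windows.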
        

Druid Ingestion Spec (Partial)


{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "realtime_views",
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "MINUTE"
    },
    "timestampSpec": {
      "column": "timestamp",
      "format": "iso"
    }
  }
}
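
The two granularities in the spec play different roles: `segmentGranularity` decides how rows are bucketed into segments (here, per hour), while `queryGranularity` truncates timestamps (here, to the minute) so that rows with identical truncated timestamps and dimensions can be rolled up. The Python sketch below mimics that idea on a handful of rows. It assumes rollup is enabled and that `page` is the only dimension; neither is stated in the partial spec, and the `rollup` function is hypothetical rather than anything Druid exposes.

```python
from collections import defaultdict
from datetime import datetime


def rollup(rows):
    """Group (iso_timestamp, page) rows by hour segment, then count rows
    per (minute-truncated timestamp, page) within each segment."""
    segments = defaultdict(lambda: defaultdict(int))
    for iso_ts, page in rows:
        dt = datetime.strptime(iso_ts, "%Y-%m-%dT%H:%M:%SZ")
        minute = dt.replace(second=0, microsecond=0)             # queryGranularity
        hour = dt.replace(minute=0, second=0, microsecond=0)     # segmentGranularity
        segments[hour.isoformat()][(minute.isoformat(), page)] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {seg: dict(cells) for seg, cells in segments.items()}
```

Coarser query granularity means fewer stored rows and faster queries, at the cost of losing sub-minute detail.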
        

Visualization (Grafana)

  • Data source: ClickHouse or Druid plugin
  • Panels: Total events, active users, latency percentiles
  • Alerts: Page view drops or spike detection
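
The drop and spike alerts mentioned above can be approximated by comparing the latest window's count with a baseline built from recent windows. The sketch below uses the mean of the previous counts as that baseline. The thresholds (`0.5` and `2.0`) and the `classify` function are illustrative assumptions, not Grafana settings; in practice the rule would be expressed in the alerting tool itself.

```python
from statistics import mean


def classify(history, current, drop_ratio=0.5, spike_ratio=2.0) -> str:
    """Return 'drop', 'spike', or 'ok' for the current count
    relative to the mean of previous window counts."""
    if not history:
        return "ok"  # no baseline yet, so nothing to compare against
    baseline = mean(history)
    if baseline == 0:
        return "spike" if current > 0 else "ok"
    ratio = current / baseline
    if ratio <= drop_ratio:
        return "drop"
    if ratio >= spike_ratio:
        return "spike"
    return "ok"
```

A ratio against a rolling mean is simple but sensitive to daily seasonality; comparing against the same time on a previous day is a common refinement.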

Observability

  • Event ingestion lag (Kafka lag metrics)
  • Streaming job failures or throughput drops
  • Query response latency spikes
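
Ingestion lag is typically measured per partition as the difference between the log end offset and the consumer group's committed offset. The sketch below computes that from two plain dictionaries. How the offsets are fetched is left out, and treating a partition with no committed offset as starting from 0 is an assumption; real behavior depends on the consumer's offset reset policy.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag = log end offset minus committed offset.

    Both arguments map partition number -> offset. A partition with no
    committed offset is treated as committed at 0 (an assumption).
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


def total_lag(end_offsets: dict, committed_offsets: dict) -> int:
    """Sum of lag across all partitions, a common single alerting metric."""
    return sum(consumer_lag(end_offsets, committed_offsets).values())
```

Lag that grows steadily usually means the stream processor cannot keep up with the ingest rate, which ties this metric back to the throughput and latency goals.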

Final Insight

Real-time analytics systems must balance throughput against latency. Durable event ingestion, horizontally scalable stream processing, and columnar OLAP stores such as ClickHouse or Druid together make it possible to query fresh data within seconds of its generation.