Swiftorial Logo
Home
Swift Lessons
AI Tools
Learn More
Career
Resources

System Design FAQ: Top Questions

22. How would you design a Time-Series Database (TSDB)?

A Time-Series Database (TSDB) stores and queries time-stamped data efficiently, such as metrics, logs, or sensor data. TSDBs optimize for insert rate, compression, and range queries.

📋 Functional Requirements

  • Insert large volumes of timestamped data
  • Efficient range scans and aggregation (e.g., avg, sum)
  • Retention policy for old data
  • Downsampling and compaction support

📦 Non-Functional Requirements

  • High write throughput (millions/sec)
  • Low-latency reads over ranges
  • Compression and deduplication

🏗️ Core Components

  • Write API: Accepts time-series points via HTTP/UDP
  • Ingestion Buffer: Stores recent writes in memory (WAL, memtable)
  • Compactor: Merges and compresses old blocks
  • Storage Engine: Appends to columnar store by time (e.g., Parquet, LSM)
  • Query Layer: Supports PromQL/SQL for querying and visualization

🔧 InfluxDB Line Protocol Example


# Format: measurement,tags fields timestamp
cpu,host=web01 usage=0.87 1718070000000000000
        

🧱 Data Model Example (Prometheus)


metric: http_requests_total
labels: {method="GET", handler="/api"}
value: 1468
timestamp: 1718070000000
        

📉 Query Example (PromQL)


rate(http_requests_total{handler="/api"}[5m])
        

📁 Storage Strategy

  • Partition data by metric → time window → block (TSM or Gorilla)
  • Use delta-encoding, XOR compression
  • Files stored in object storage or SSD

🗑️ Retention and Downsampling

  • High-resolution data kept for 7 days
  • Downsampled (e.g., hourly avg) data stored for 6 months

📈 Use Cases

  • Infrastructure monitoring (e.g., Grafana+Prometheus)
  • IoT sensor telemetry
  • Financial tick data ingestion

📌 Final Insight

A TSDB balances write-heavy ingestion with query efficiency. Optimizations include using in-memory buffers, aggressive compaction, and block-oriented storage. Retention and aggregation policies shape scalability and cost.