System Design FAQ: Top Questions

56. How would you design a Real-Time Chat System like WhatsApp or Slack?

A Real-Time Chat System allows users to send and receive messages instantly, with features like read receipts, typing indicators, presence status, media sharing, and conversation history.

📋 Functional Requirements

  • 1:1 and group messaging
  • Message ordering within a conversation, with acknowledged delivery
  • Online/offline presence and typing indicators
  • Support for emojis, media, file uploads
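The presence requirement can be illustrated with a heartbeat-based tracker. This is a minimal sketch rather than a production design: the PresenceTracker class, its method names, and the 30-second TTL are assumptions made for illustration, and an in-memory Map stands in for a shared store with key expiry.

```javascript
// Hypothetical in-memory presence tracker; a Map stands in for a shared store with TTLs.
class PresenceTracker {
  constructor(ttlMs = 30_000, now = () => Date.now()) {
    this.ttlMs = ttlMs;        // how long a heartbeat keeps a user "online" (assumed value)
    this.now = now;            // injectable clock, which makes the class easy to test
    this.lastSeen = new Map(); // userId -> timestamp of the most recent heartbeat
  }

  // Called whenever the client sends a heartbeat or any other frame.
  heartbeat(userId) {
    this.lastSeen.set(userId, this.now());
  }

  // Called when the WebSocket closes cleanly.
  disconnect(userId) {
    this.lastSeen.delete(userId);
  }

  // A user is online only if a heartbeat arrived within the TTL window.
  isOnline(userId) {
    const seen = this.lastSeen.get(userId);
    return seen !== undefined && this.now() - seen < this.ttlMs;
  }
}
```

Expiring presence by TTL means a client that drops off the network without closing its socket is eventually shown as offline, at the cost of a delay of up to one TTL.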

📦 Non-Functional Requirements

  • Low latency & real-time delivery
  • Scalability to millions of concurrent users
  • Durable message storage with retries

🏗️ High-Level Architecture

  • WebSocket Gateway: Maintains a persistent connection to each client
  • Message Broker: Kafka, Redis Streams, or NATS
  • Chat Service: Handles routing, storage, and acknowledgments
  • Presence Tracker: Redis or ZooKeeper
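The flow between these components can be sketched in a single process. In this illustration, which is an assumption-laden stand-in and not a real deployment, a Node.js EventEmitter plays the role of the message broker, a Map plays the role of the gateway's connection registry, and the topic name 'chat.messages', the registerConnection helper, and the undelivered array are hypothetical.

```javascript
const { EventEmitter } = require('node:events');

const broker = new EventEmitter(); // stand-in for Kafka, Redis Streams, or NATS
const connections = new Map();     // gateway registry: userId -> socket-like object
const undelivered = [];            // messages whose receiver has no open connection

// Gateway: remember which socket belongs to which user.
function registerConnection(userId, socket) {
  connections.set(userId, socket);
}

// Gateway: publish an incoming message to the broker instead of delivering it directly.
function routeMessage(msg) {
  broker.emit('chat.messages', msg);
}

// Chat service: consume from the broker and deliver to the receiver if connected.
broker.on('chat.messages', (msg) => {
  const socket = connections.get(msg.receiver_id);
  if (socket) {
    socket.send(JSON.stringify(msg));
  } else {
    undelivered.push(msg); // would be stored and retried or pushed as a notification
  }
});
```

Putting a broker between the gateway and the chat service lets the two scale independently: gateways only hold connections, while routing and storage happen in consumers.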

📄 Message Schema (JSON)


{
  "msg_id": "123e4567-e89b-12d3-a456-426614174000",
  "sender_id": "user1",
  "receiver_id": "user2",
  "type": "text",
  "content": "Hello!",
  "ts": 1718132025,
  "status": "sent"
}
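
Before a message with this shape is routed, the server would typically validate it. The following is a minimal sketch under stated assumptions: the validateMessage function is hypothetical, and the allowed values for type and status are illustrative guesses, since the schema above only shows "text" and "sent".

```javascript
// Assumed value sets; the schema above only shows "text" and "sent".
const ALLOWED_TYPES = new Set(['text', 'image', 'file']);
const ALLOWED_STATUSES = new Set(['sent', 'delivered', 'read']);

// Returns a list of human-readable problems; an empty list means the message is valid.
function validateMessage(msg) {
  if (msg === null || typeof msg !== 'object') {
    return ['message must be an object'];
  }
  const errors = [];
  for (const field of ['msg_id', 'sender_id', 'receiver_id', 'content']) {
    if (typeof msg[field] !== 'string' || msg[field].length === 0) {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  if (!ALLOWED_TYPES.has(msg.type)) {
    errors.push('type is not supported');
  }
  // ts is expected to be Unix time in whole seconds, as in the example above.
  if (!Number.isInteger(msg.ts) || msg.ts <= 0) {
    errors.push('ts must be a positive integer (epoch seconds)');
  }
  if (!ALLOWED_STATUSES.has(msg.status)) {
    errors.push('status is not supported');
  }
  return errors;
}
```

Returning all problems at once, instead of throwing on the first, makes it easier to send the client a single descriptive error frame.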
        

📡 WebSocket Server (Node.js Snippet)


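const { WebSocketServer } = require('ws'); // npm package "ws"
const wss = new WebSocketServer({ port: 8080 }); // port is an example value
wss.on('connection', (socket) => {
  socket.on('message', (msg) => {
    let parsed;
    try { parsed = JSON.parse(msg.toString()); } catch { return; } // drop malformed frames
    routeMessage(parsed); // Publish to broker (routeMessage is defined elsewhere)
  });
});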
        

🗂️ Persistence Layer (PostgreSQL)


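CREATE TABLE messages (
  msg_id UUID PRIMARY KEY,             -- unique message ID; doubles as a deduplication key
  sender_id TEXT NOT NULL,
  receiver_id TEXT NOT NULL,
  content TEXT,
  ts TIMESTAMPTZ NOT NULL,             -- convert epoch seconds with to_timestamp()
  status TEXT NOT NULL DEFAULT 'sent'
);
-- Index for loading a recipient's message history in time order
CREATE INDEX idx_messages_receiver_ts ON messages (receiver_id, ts);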
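
Writing a message into this table can be sketched as a parameterized query. The buildInsertQuery helper below is hypothetical and only builds the query object, so it runs without a database; it assumes ts arrives as epoch seconds, as in the message schema, and converts it with PostgreSQL's to_timestamp(). The ON CONFLICT clause is an assumption about how retried writes should behave.

```javascript
// Builds a parameterized INSERT for the messages table (no database connection needed).
function buildInsertQuery(msg) {
  return {
    text:
      'INSERT INTO messages (msg_id, sender_id, receiver_id, content, ts, status) ' +
      'VALUES ($1, $2, $3, $4, to_timestamp($5), $6) ' +
      // A retried write of the same msg_id becomes a no-op instead of an error.
      'ON CONFLICT (msg_id) DO NOTHING',
    values: [
      msg.msg_id,
      msg.sender_id,
      msg.receiver_id,
      msg.content,
      msg.ts, // epoch seconds; converted to TIMESTAMPTZ by to_timestamp()
      msg.status,
    ],
  };
}
```

With the node-postgres (pg) client, an object of this shape can be passed to pool.query(), which accepts text and values fields. Using placeholders rather than string concatenation keeps user-supplied content out of the SQL text.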
        

🔔 Notification & Retry Logic

  • Send a push notification if the receiver is offline
  • Retry delivery of unacknowledged messages when the receiver reconnects
  • Use acknowledgement receipts to advance message status (e.g., sent → delivered → read)
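
The retry and acknowledgement steps can be sketched as a small state tracker. Everything named here is hypothetical: the DeliveryTracker class, its methods, and the sent → delivered → read transition table are assumptions for illustration, and in-memory Maps stand in for durable storage.

```javascript
// Assumed status lifecycle: each status lists the statuses it may move to.
const TRANSITIONS = { sent: ['delivered'], delivered: ['read'], read: [] };

class DeliveryTracker {
  constructor() {
    this.messages = new Map(); // msg_id -> message with its current status
    this.pending = new Map();  // receiver_id -> Set of msg_ids awaiting a delivery ack
  }

  // Record an outgoing message as sent and awaiting acknowledgement.
  track(msg) {
    this.messages.set(msg.msg_id, { ...msg, status: 'sent' });
    if (!this.pending.has(msg.receiver_id)) {
      this.pending.set(msg.receiver_id, new Set());
    }
    this.pending.get(msg.receiver_id).add(msg.msg_id);
  }

  // Apply an acknowledgement; returns false for unknown messages or illegal transitions.
  ack(msgId, nextStatus) {
    const msg = this.messages.get(msgId);
    if (!msg || !TRANSITIONS[msg.status].includes(nextStatus)) return false;
    msg.status = nextStatus;
    if (nextStatus === 'delivered') {
      this.pending.get(msg.receiver_id)?.delete(msgId);
    }
    return true;
  }

  // On reconnect, resend everything still unacknowledged, in the order it was tracked.
  resendPending(receiverId, send) {
    const ids = [...(this.pending.get(receiverId) ?? [])];
    ids.forEach((id) => send(this.messages.get(id)));
    return ids.length;
  }
}
```

Because a message stays pending until its delivery acknowledgement arrives, a receiver may see the same message twice after a reconnect, so clients would need to deduplicate by msg_id.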

📊 Observability

  • Message delivery latency (p50/p95)
  • WebSocket open/close counts
  • Errors per user or region
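
The p50/p95 latency figures can be computed from raw samples as follows. This is a simple sketch using the nearest-rank method; the percentile function is a hypothetical helper, and a real deployment would more likely rely on the histogram support of its metrics system than on sorting raw samples.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort, input left untouched
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Example: delivery latencies in milliseconds (made-up sample data).
const latenciesMs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
const p50 = percentile(latenciesMs, 50); // 50
const p95 = percentile(latenciesMs, 95); // 100
```

Tracking p95 alongside p50 matters because a healthy median can hide a slow tail that a noticeable share of users experience.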

🧰 Tools and Infra

  • WebSocket: NGINX + uWSGI, Socket.IO, Phoenix Channels
  • Storage: PostgreSQL, Cassandra, ScyllaDB
  • Queue: Redis Streams, Kafka, RabbitMQ

📌 Final Insight

Real-time messaging depends on coordinating persistent connections, scalable message routing, and eventually consistent state across services. Choose queues and databases based on expected message volume and latency sensitivity.