System Design FAQ: Top Questions

9. How would you design a Real-Time Chat System (like WhatsApp or Slack)?

A Real-Time Chat System facilitates instant message exchange between users or groups with high reliability, low latency, and support for multimedia, typing indicators, and delivery acknowledgments.

📋 Functional Requirements

  • 1:1 and group messaging
  • Message delivery & read receipts
  • Typing indicators
  • Online/offline presence

📦 Non-Functional Requirements

  • Low end-to-end message delivery latency (target: under 100 ms)
  • High availability and fault tolerance
  • Scalable to millions of concurrent connections

🏗️ Architecture Overview

  • Frontend: Web/Mobile clients using WebSockets or long-polling
  • Gateway: Handles auth, routing, and user sessions
  • Message Broker: Kafka/PubSub to decouple producers and consumers
  • Chat Service: Core messaging logic
  • Message Store: DB for chat history (Cassandra, MongoDB)
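To make the data flow between these components concrete, here is a minimal in-process sketch. It assumes an in-memory stand-in for the broker and a plain array for the message store; `InMemoryBroker`, `Gateway`, `startChatService`, and the `chat.inbound` topic are hypothetical names chosen for illustration, not part of any real library.

```javascript
// In-memory stand-in for the message broker (Kafka/PubSub in the overview).
class InMemoryBroker {
  constructor() {
    this.handlers = new Map(); // topic -> array of subscriber callbacks
  }
  subscribe(topic, handler) {
    const list = this.handlers.get(topic) || [];
    list.push(handler);
    this.handlers.set(topic, list);
  }
  publish(topic, event) {
    for (const handler of this.handlers.get(topic) || []) handler(event);
  }
}

// Gateway: tracks which user owns which connection and hands inbound
// messages to the broker instead of calling the chat service directly.
class Gateway {
  constructor(broker) {
    this.broker = broker;
    this.connections = new Map(); // userId -> object with a send(text) method
  }
  connect(userId, connection) {
    this.connections.set(userId, connection);
  }
  disconnect(userId) {
    this.connections.delete(userId);
  }
  receive(senderId, receiverId, payload) {
    this.broker.publish('chat.inbound', { senderId, receiverId, payload });
  }
  push(userId, event) {
    const connection = this.connections.get(userId);
    if (!connection) return false; // recipient is offline
    connection.send(JSON.stringify(event));
    return true;
  }
}

// Chat service: persists each message first, then attempts real-time delivery.
function startChatService(broker, gateway, messageStore) {
  broker.subscribe('chat.inbound', (event) => {
    const record = { ...event, status: 'sent' };
    messageStore.push(record);
    if (gateway.push(event.receiverId, event)) record.status = 'delivered';
  });
}
```

Because the gateway only publishes to the broker, the chat service can be scaled or restarted independently; in a real deployment the broker and store would be external systems rather than in-process objects.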

📤 WebSocket Communication (Node.js)


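const WebSocket = require('ws');

// Start a WebSocket server on port 8080 (requires the `ws` package).
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', function connection(ws) {
  ws.on('message', function incoming(message) {
    // `ws` v8+ delivers messages as Buffers, so decode before treating as text.
    const text = message.toString();
    console.log('received:', text);
    ws.send('echo: ' + text);
  });

  // Log socket errors instead of leaving them unhandled.
  ws.on('error', console.error);

  ws.send('Welcome to chat server!');
});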
        

🗃️ Message Schema (MongoDB)


{
  "message_id": "uuid",
  "sender_id": "user123",
  "receiver_id": "user456",
  "timestamp": "2024-06-10T18:00:00Z",
  "message_type": "text", // or image, video
  "payload": "Hello there!",
  "status": "delivered" // or sent/read
}
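As a rough sketch, a document of this shape could be built and validated before it is written to the store. The `createMessage` helper and the `MESSAGE_TYPES` list are hypothetical; the field names follow the schema above, and new messages are assumed to start in the `sent` state.

```javascript
const { randomUUID } = require('crypto');

// Message types mentioned in the schema comment.
const MESSAGE_TYPES = ['text', 'image', 'video'];

// Builds a document matching the schema above; throws on invalid input.
function createMessage({ senderId, receiverId, messageType = 'text', payload }, now = new Date()) {
  if (!senderId || !receiverId) {
    throw new Error('sender_id and receiver_id are required');
  }
  if (!MESSAGE_TYPES.includes(messageType)) {
    throw new Error(`unsupported message_type: ${messageType}`);
  }
  if (typeof payload !== 'string' || payload.length === 0) {
    throw new Error('payload must be a non-empty string');
  }
  return {
    message_id: randomUUID(),
    sender_id: senderId,
    receiver_id: receiverId,
    timestamp: now.toISOString(), // ISO 8601, UTC
    message_type: messageType,
    payload,
    status: 'sent', // assumed initial state
  };
}
```

Validating at the service boundary keeps malformed documents out of the store, where they would be harder to clean up later.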
        

📈 Delivery Semantics

  • Sent → Acked → Delivered → Read: each message advances through these states in order, and every transition is recorded
  • Message Queue: Retry logic for offline users
  • Store and forward: Buffer undelivered messages in Redis or Kafka
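The state tracking and store-and-forward ideas above might look like the following sketch. It assumes four status values (`sent`, `acked`, `delivered`, `read`) and an in-memory buffer standing in for Redis or Kafka; `advanceStatus` and `OfflineBuffer` are hypothetical names.

```javascript
// Allowed forward transitions, following Sent -> Acked -> Delivered -> Read.
const NEXT_STATUS = { sent: 'acked', acked: 'delivered', delivered: 'read' };

// Moves a message one step forward. A repeated status is ignored (duplicate
// acks are common with retries); skips and regressions are rejected.
function advanceStatus(message, nextStatus) {
  if (message.status === nextStatus) return message;
  if (NEXT_STATUS[message.status] !== nextStatus) {
    throw new Error(`illegal transition: ${message.status} -> ${nextStatus}`);
  }
  return { ...message, status: nextStatus };
}

// Store-and-forward buffer: holds messages for offline users until they reconnect.
class OfflineBuffer {
  constructor() {
    this.pending = new Map(); // userId -> queued messages, oldest first
  }
  enqueue(userId, message) {
    const queue = this.pending.get(userId) || [];
    queue.push(message);
    this.pending.set(userId, queue);
  }
  // Calls deliver(message) for each queued message. Messages whose delivery
  // throws stay queued so a later flush can retry them.
  flush(userId, deliver) {
    const queue = this.pending.get(userId) || [];
    const failed = [];
    for (const message of queue) {
      try {
        deliver(message);
      } catch (err) {
        failed.push(message);
      }
    }
    if (failed.length > 0) this.pending.set(userId, failed);
    else this.pending.delete(userId);
    return queue.length - failed.length; // number delivered in this pass
  }
}
```

Calling `flush` when a user reconnects drains the backlog in order, and anything that fails again simply waits for the next attempt.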

☁️ Redis for Presence Management


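The key is typically refreshed on each client heartbeat, so it expires 30 seconds after the last one and the user then reads as offline:

SET user:123:online 1 EX 30
GET user:123:online → "1" (online) or nil (key expired, treat as offline)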
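The same expire-on-silence behaviour can be sketched without a Redis server. The `PresenceTracker` class below is a hypothetical in-memory stand-in that mimics `SET ... EX` and `GET`; the injectable clock is an assumption added so the expiry logic can be exercised deterministically.

```javascript
// In-memory stand-in for Redis `SET key 1 EX ttl` / `GET key`.
class PresenceTracker {
  constructor(ttlSeconds = 30, now = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;             // injectable clock, returns milliseconds
    this.expiresAt = new Map(); // userId -> expiry timestamp (ms)
  }

  // Equivalent of: SET user:<id>:online 1 EX <ttl>
  heartbeat(userId) {
    this.expiresAt.set(userId, this.now() + this.ttlMs);
  }

  // Equivalent of: GET user:<id>:online (true for "1", false for nil)
  isOnline(userId) {
    const expiry = this.expiresAt.get(userId);
    if (expiry === undefined) return false;
    if (expiry <= this.now()) {
      this.expiresAt.delete(userId); // lazily drop the expired entry
      return false;
    }
    return true;
  }
}
```

The heartbeat interval should be comfortably shorter than the TTL so that one delayed heartbeat does not flip a connected user to offline.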
        

👥 Group Chat Considerations

  • Fan-out to all members using Kafka or topic queues
  • Deduplication of message delivery
  • Limit max group size or shard large groups
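A minimal sketch of fan-out with deduplication, assuming delivery is idempotent per (message, member) pair. `GroupFanout` and its `deliver` callback are hypothetical; a real system would back the seen-set with a shared store and expire old keys rather than keep them in process memory.

```javascript
class GroupFanout {
  constructor(deliver) {
    this.deliver = deliver; // deliver(memberId, message): hypothetical transport hook
    this.seen = new Set();  // "<message_id>:<member_id>" pairs already delivered
  }

  // Delivers the message to every member except the sender, at most once each.
  // Returns how many deliveries happened in this call.
  send(message, memberIds) {
    let delivered = 0;
    for (const memberId of memberIds) {
      if (memberId === message.sender_id) continue; // do not echo to the sender
      const key = `${message.message_id}:${memberId}`;
      if (this.seen.has(key)) continue;             // duplicate, e.g. broker redelivery
      this.seen.add(key);
      this.deliver(memberId, message);
      delivered += 1;
    }
    return delivered;
  }
}
```

If the broker redelivers the same message, a second `send` call is a no-op for members who already received it, while members added since then still get their copy.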

📊 Observability

  • Track message latency (send → delivered → read)
  • Monitor WebSocket uptime and errors
  • Log undelivered messages for recovery
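One way to track send-to-delivered latency is to record timestamps per message and summarise them as percentiles. The sketch below uses a nearest-rank percentile; `LatencyTracker` and `percentile` are hypothetical helpers, and production systems would usually export these values to a metrics backend instead of holding them in memory.

```javascript
// Nearest-rank percentile over an array of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

class LatencyTracker {
  constructor() {
    this.sentAt = new Map(); // message_id -> send timestamp (ms)
    this.deliveryMs = [];    // completed send -> delivered latencies
  }
  markSent(messageId, timestampMs) {
    this.sentAt.set(messageId, timestampMs);
  }
  markDelivered(messageId, timestampMs) {
    const start = this.sentAt.get(messageId);
    if (start === undefined) return; // unknown or already completed
    this.sentAt.delete(messageId);
    this.deliveryMs.push(timestampMs - start);
  }
  // Messages sent but never delivered: candidates for recovery logging.
  undelivered() {
    return [...this.sentAt.keys()];
  }
  summary() {
    return {
      count: this.deliveryMs.length,
      p50: percentile(this.deliveryMs, 50),
      p95: percentile(this.deliveryMs, 95),
    };
  }
}
```

Tail percentiles such as p95 tend to be more informative than averages here, because a small number of slow deliveries is exactly what users notice.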

🔐 Security

  • Token-based WebSocket authentication (JWT)
  • Encrypt messages at rest and in transit
  • Rate limiting per user or connection to curb spam and abuse
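Rate limiting is often implemented as a token bucket per user. The sketch below is one possible version, not a prescribed design: `TokenBucket`, its parameters, and the injectable clock are all assumptions made for illustration.

```javascript
// Per-user token bucket: each message costs one token; tokens refill over time.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = () => Date.now()) {
    this.capacity = capacity;               // maximum burst size
    this.refillPerSecond = refillPerSecond; // sustained messages per second
    this.now = now;                         // injectable clock, returns milliseconds
    this.buckets = new Map();               // userId -> { tokens, updatedAt }
  }

  // Returns true if the user may send a message now, false if throttled.
  allow(userId) {
    const t = this.now();
    const bucket = this.buckets.get(userId) || { tokens: this.capacity, updatedAt: t };
    const elapsedSeconds = (t - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = t;
    this.buckets.set(userId, bucket);
    if (bucket.tokens < 1) return false;
    bucket.tokens -= 1;
    return true;
  }
}
```

The capacity controls how large a burst is tolerated, while the refill rate sets the sustained message rate; a gateway could call `allow` before forwarding each inbound message.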

📌 Final Insight

A real-time chat system is a classic case of low-latency, stateful infrastructure. Resilience comes from decoupling components through pub/sub messaging, tracking each message's state transitions, and falling back to the persistent message store when real-time delivery fails.