System Design FAQ: Top Questions

26. How would you design a Rate Limiter System?

A Rate Limiter restricts the number of requests a user or client can make to a system within a defined timeframe. It helps mitigate abuse and denial-of-service traffic, enforces fair usage, and protects system stability.

📋 Functional Requirements

Limit number of requests per IP/user/token
Configurable rate thresholds (e.g., 100 req/min)
Support global or per-endpoint limits
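
The requirements above can be sketched as a small rule lookup. This is only an illustration: the `Limit` type, the endpoint paths, the numbers, and the `rl:` key layout are all hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limit:
    max_requests: int    # allowed requests per window
    window_seconds: int  # window length in seconds


# Hypothetical configuration: endpoint-specific limits plus a global default.
GLOBAL_LIMIT = Limit(max_requests=100, window_seconds=60)
ENDPOINT_LIMITS = {
    "/login": Limit(max_requests=5, window_seconds=60),
    "/search": Limit(max_requests=30, window_seconds=60),
}


def resolve_limit(endpoint: str) -> Limit:
    """Return the per-endpoint limit if one is configured, else the global one."""
    return ENDPOINT_LIMITS.get(endpoint, GLOBAL_LIMIT)


def counter_key(identity_kind: str, identity: str, endpoint: Optional[str] = None) -> str:
    """Build the key a counter would live under (identity_kind: 'ip', 'user', 'token')."""
    # Endpoints without their own limit share one global counter, marked "*".
    scope = endpoint if endpoint in ENDPOINT_LIMITS else "*"
    return f"rl:{identity_kind}:{identity}:{scope}"
```

Keeping the identity kind in the key lets IP, user, and token limits coexist without colliding.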

📦 Non-Functional Requirements

Low added latency (~1 ms per check)
Highly available and distributed
Counters shared or synchronized across nodes in near real time

🏗️ Common Algorithms

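Token Bucket: Tokens refill at a fixed rate up to a maximum capacity; each request consumes one token, so short bursts are allowed
Leaky Bucket: Requests queue and are processed at a constant rate; requests that overflow the queue are rejected
Fixed Window Counter: Counts requests per fixed time window; simple, but allows bursts at window boundaries
Sliding Window Log/Counter: Tracks requests over a rolling window for finer granularity, at higher memory or compute cost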
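
To make the difference between the window-based algorithms concrete, here is a minimal in-memory sketch of a fixed window counter and a sliding window log. The class names are hypothetical, timestamps are passed in explicitly so the behavior is easy to follow, and a production version would keep this state in a shared store with expiry rather than in process memory.

```python
from collections import defaultdict, deque


class FixedWindowCounter:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        # (client, window index) -> count; old windows are never evicted here,
        # a real store would expire them with a TTL.
        self.counts = defaultdict(int)

    def allow(self, client, now):
        bucket = (client, int(now // self.window))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


class SlidingWindowLog:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.logs = defaultdict(deque)  # client -> timestamps of accepted requests

    def allow(self, client, now):
        log = self.logs[client]
        # Drop timestamps that have fallen out of the trailing window.
        while log and log[0] <= now - self.window:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True
```

With a limit of 2 requests per 10 seconds and requests at t = 9, 9.5, 10, and 10.5, the fixed window counter accepts all four because the window boundary at t = 10 resets the count, while the sliding window log rejects the last two.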

🧪 Redis-Based Token Bucket (Lua Script)


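-- rate_limit.lua
-- KEYS[1]: bucket key (a hash with fields "tokens" and "ts")
-- ARGV: max_tokens, refill_rate (tokens/second), now (seconds), ttl (seconds)
-- ttl should be at least max_tokens / refill_rate
local key = KEYS[1]
local max_tokens = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local ttl = tonumber(ARGV[4])

-- One hash keeps all state under KEYS[1]; a missing bucket starts full
local state = redis.call("HMGET", key, "tokens", "ts")
local tokens = tonumber(state[1]) or max_tokens
local last_refill = tonumber(state[2]) or now

-- Credit whole tokens only and advance the timestamp by the time they
-- represent, so partial progress toward the next token is not discarded
local refill = math.floor(math.max(0, now - last_refill) * refill_rate)
if refill > 0 then
  tokens = math.min(tokens + refill, max_tokens)
  last_refill = last_refill + refill / refill_rate
end

local allowed = 0
if tokens > 0 then
  tokens = tokens - 1
  allowed = 1
end

-- HSET with several field/value pairs requires Redis 4.0+
redis.call("HSET", key, "tokens", tokens, "ts", last_refill)
redis.call("EXPIRE", key, ttl)
return allowed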
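
For experimenting with token-bucket behavior without a Redis instance, the same idea can be sketched in plain Python. This is not a port of any library: the `TokenBucket` class is hypothetical, and it makes one explicit design choice, namely that the refill timestamp advances only by the time actually converted into whole tokens, so a steady stream of requests cannot starve the refill.

```python
class TokenBucket:
    def __init__(self, max_tokens, refill_rate, now=0.0):
        self.max_tokens = max_tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = max_tokens        # a new bucket starts full
        self.last_refill = now

    def allow(self, now):
        """Consume one token if available; `now` is a timestamp in seconds."""
        elapsed = max(0.0, now - self.last_refill)
        refill = int(elapsed * self.refill_rate)  # whole tokens only
        if refill > 0:
            self.tokens = min(self.tokens + refill, self.max_tokens)
            # Advance the clock only by the time converted into tokens,
            # keeping the fractional remainder for the next call.
            self.last_refill += refill / self.refill_rate
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False
```

With `TokenBucket(max_tokens=2, refill_rate=1.0)`, two immediate requests pass, a third is denied, and the next request is admitted once a full second's worth of refill has accumulated.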

📤 API Gateway Integration (Kong)


{
  "name": "rate-limiting",
  "config": {
    "second": 5,
    "minute": 100,
    "policy": "redis",
    "redis_host": "redis.internal",
    "redis_port": 6379
  }
}

📋 Headers to Expose

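X-RateLimit-Limit: maximum requests allowed in the current window
X-RateLimit-Remaining: requests remaining in the current window
X-RateLimit-Reset: time until the window resets (seconds remaining or a Unix timestamp, depending on the API)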
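
As a rough sketch of how these headers might be produced, the hypothetical helper below builds them from limiter state. It assumes the reset value is expressed as seconds remaining, and it adds a 429 status with a `Retry-After` header on rejection; both are standard HTTP, but they are additions beyond the list above.

```python
import math


def rate_limit_response(limit, remaining, reset_at, now):
    """Return (status, headers) for a request, given the limiter's state.

    reset_at and now are timestamps in seconds; remaining is the number of
    requests left after this one, or a negative value if it was rejected.
    """
    seconds_to_reset = max(0, math.ceil(reset_at - now))
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(seconds_to_reset),
    }
    if remaining < 0:
        # Rejected: tell the client how long to wait before retrying.
        headers["Retry-After"] = str(seconds_to_reset)
        return 429, headers
    return 200, headers
```

Clients can then back off based on the headers instead of retrying blindly.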

📈 Observability & Metrics

Requests rejected (rate-limited) per user/IP
Token refill failures (e.g., errors or timeouts when updating the counter store)
Sustained high denial rates per client (potential abuse)
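
One way to track these signals is a pair of per-client counters. The sketch below is illustrative only: the `LimiterMetrics` class and its thresholds are hypothetical, and a real deployment would more likely export counters to a metrics system than keep them in process memory.

```python
from collections import Counter


class LimiterMetrics:
    def __init__(self):
        self.allowed = Counter()  # client -> accepted requests
        self.denied = Counter()   # client -> rate-limited requests

    def record(self, client, allowed):
        (self.allowed if allowed else self.denied)[client] += 1

    def denial_rate(self, client):
        total = self.allowed[client] + self.denied[client]
        return self.denied[client] / total if total else 0.0

    def suspects(self, min_requests=20, threshold=0.5):
        """Clients with enough traffic whose denial rate meets the threshold."""
        clients = set(self.allowed) | set(self.denied)
        return sorted(
            c for c in clients
            if self.allowed[c] + self.denied[c] >= min_requests
            and self.denial_rate(c) >= threshold
        )
```

The `min_requests` floor keeps a client with one denied request out of the suspect list.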

📌 Final Insight

A rate limiter is best built using a combination of efficient data structures, atomic storage (e.g., Redis), and algorithm selection based on business needs. It’s critical to enforce limits at the edge or API gateway level.
