System Design FAQ: Top Questions

45. How would you design a Rate Limiter for APIs?

A Rate Limiter protects services from abuse by capping how often clients can make requests over a defined window. Common strategies include token bucket, leaky bucket, and fixed/sliding windows.

📋 Functional Requirements

Limit number of requests per user/IP/token
Support fixed or sliding window semantics
Return HTTP 429 (Too Many Requests) when the limit is exceeded
Optional: burst allowance (e.g. 100 reqs/min with 20 burst)
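To make the sliding-window requirement concrete, here is a minimal in-memory sketch of a sliding-window log. The class name `SlidingWindowLimiter`, its parameters, and the choice to pass `now` in explicitly are illustrative assumptions, not part of any library; a production limiter would keep this state in a shared store rather than in process memory.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Sliding-window log: remember the timestamps of accepted requests per key."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def check(self, key: str, now: float) -> int:
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429
        hits.append(now)
        return 200
```

The key can be a user ID, an IP address, or an API token. Memory grows with the number of accepted requests per window, which is the usual trade-off of the log approach against counter-based windows.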

📦 Non-Functional Requirements

Low-latency decision (<2ms)
Memory efficient tracking
Horizontally scalable, with counters shared across instances so limits are enforced consistently

🔁 Token Bucket Algorithm

Each user has a bucket with max capacity
Tokens refill at fixed rate (e.g., 1/sec)
A request consumes 1 token; if none remain → 429
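The three rules above can be sketched in a few lines of single-process Python. The `TokenBucket` class and its field names are hypothetical, and time is passed in as `now` (seconds) so the behavior is easy to reason about; treat it as an illustration under those assumptions rather than a reference implementation.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float     # maximum tokens the bucket can hold (the burst size)
    refill_rate: float  # tokens added per second
    tokens: float       # tokens currently available
    last_refill: float  # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then try to spend one token."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond with 429
```

Fractional tokens are kept on purpose: flooring the refill on every call would discard partial progress and make the effective rate lower than configured.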

⚙️ Redis Lua Script (Atomic Limiting)


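-- KEYS[1]: hash holding this client's bucket state (fields: tokens, ts)
local bucket_key = KEYS[1]
local rate = tonumber(ARGV[1])      -- tokens added per second
local capacity = tonumber(ARGV[2])  -- maximum bucket size
local now = tonumber(ARGV[3])       -- current time in seconds
local ttl = tonumber(ARGV[4])       -- key expiry in seconds

-- A missing key means a new client: start with a full bucket.
local state = redis.call("HMGET", bucket_key, "tokens", "ts")
local last_tokens = tonumber(state[1]) or capacity
local last_ts = tonumber(state[2]) or now

-- Refill for the time elapsed since the last call, capped at capacity.
local elapsed = math.max(0, now - last_ts)
local tokens = math.min(capacity, last_tokens + elapsed * rate)

local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end

-- Persist state on both paths so the refill clock keeps advancing.
redis.call("HSET", bucket_key, "tokens", tokens, "ts", now)
redis.call("EXPIRE", bucket_key, ttl)
return allowed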

🗄️ Fixed Window in PostgreSQL


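CREATE TABLE rate_limits (
  key           TEXT PRIMARY KEY,
  window_start  TIMESTAMPTZ NOT NULL,
  request_count INT NOT NULL
);

-- One upsert per request: start a new 1-minute window when the stored one
-- has expired, otherwise increment. Reject if the returned count exceeds the limit.
INSERT INTO rate_limits AS r (key, window_start, request_count)
VALUES ($1, date_trunc('minute', now()), 1)
ON CONFLICT (key) DO UPDATE SET
  request_count = CASE WHEN r.window_start < EXCLUDED.window_start
                       THEN 1 ELSE r.request_count + 1 END,
  window_start  = EXCLUDED.window_start
RETURNING request_count;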
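The same fixed-window logic can be sketched in memory to show how the window reset and counter interact. `FixedWindowLimiter` and its dictionary of rows are hypothetical stand-ins for the `rate_limits` table; the sketch assumes windows aligned to multiples of `window_seconds`.

```python
class FixedWindowLimiter:
    """Fixed window: one counter per key, reset when a new window starts."""

    def __init__(self, limit: int, window_seconds: int = 60) -> None:
        self.limit = limit
        self.window = window_seconds
        self._rows = {}  # key -> (window_start, request_count)

    def allow(self, key: str, now: float) -> bool:
        # Align the window to a multiple of its length, e.g. the top of the minute.
        window_start = int(now // self.window) * self.window
        start, count = self._rows.get(key, (window_start, 0))
        if start < window_start:
            count = 0  # stored window has expired, start counting again
        count += 1
        self._rows[key] = (window_start, count)
        return count <= self.limit
```

One known weakness of fixed windows shows up here: a client can spend its full limit at the end of one window and again at the start of the next, briefly doubling the effective rate. Sliding windows or a token bucket avoid that edge.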

🔎 Observability

Blocked requests per second (429 rate)
Top offending IPs/users
Average window utilization
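As a rough illustration of how these signals might be collected, the sketch below tracks blocked requests and the keys that trigger them. `LimiterMetrics` and its method names are invented for this example; a real deployment would more likely export counters to a metrics system than hold them in process.

```python
from collections import Counter


class LimiterMetrics:
    """Tracks how many requests were blocked and which keys caused the blocks."""

    def __init__(self) -> None:
        self.total = 0
        self.blocked = 0
        self.blocked_by_key = Counter()

    def record(self, key: str, allowed: bool) -> None:
        self.total += 1
        if not allowed:
            self.blocked += 1
            self.blocked_by_key[key] += 1

    def blocked_ratio(self) -> float:
        """Share of all recorded requests that were rejected with 429."""
        return self.blocked / self.total if self.total else 0.0

    def top_offenders(self, n: int = 3) -> list:
        """The n keys (users or IPs) with the most blocked requests."""
        return self.blocked_by_key.most_common(n)
```

Dividing `blocked` by the length of the sampling interval gives the 429 rate per second; window utilization would need the per-key request count and limit, which this sketch does not track.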

🧰 Tools/Infra Used

Fast Path: Redis + Lua (atomic & fast)
Slow Path: PostgreSQL/DynamoDB fallback
Distributed: Envoy or an API gateway with built-in rate limiting
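One way to wire the fast and slow paths together is a small wrapper that falls back when the fast store is unreachable. This is a sketch under assumptions: `check_with_fallback`, `fast`, and `slow` are hypothetical names, each callable is assumed to return `True` when the request is allowed, and the fast path is assumed to signal an outage by raising `ConnectionError`.

```python
from typing import Callable

Check = Callable[[str], bool]  # takes a client key, returns True if allowed


def check_with_fallback(key: str, fast: Check, slow: Check) -> bool:
    """Ask the fast limiter first; use the slow one only if the fast store is down."""
    try:
        return fast(key)
    except ConnectionError:
        # Fast path unavailable: degrade to the slower, durable store.
        return slow(key)
```

Whether to fall back, fail open (allow), or fail closed (reject) during an outage is a policy decision; the sketch only shows the fallback variant.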

📌 Final Insight

Rate limiters are crucial for reliability and cost control. Redis is a preferred backend for high-frequency checks, while PostgreSQL or cloud-native limiters provide control-plane enforcement and history tracking.
