System Design FAQ: Top Questions
61. How would you design a Rate Limiting System?
A Rate Limiting System controls the number of requests users or clients can make in a given time window, helping to prevent abuse, overuse, and service degradation.
Functional Requirements
- Limit requests per user/IP per time unit
- Return appropriate status codes and headers (e.g., HTTP 429 Too Many Requests with Retry-After)
- Support multiple strategies: fixed window, sliding window, token bucket
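The header requirement can be illustrated with a small helper that turns limiter state into response headers. This is only a sketch: the function name `rateLimitHeaders` and its parameters are hypothetical, and the `X-RateLimit-*` names are a common convention rather than a formal standard (`Retry-After` is a standard HTTP header).

```javascript
// Hypothetical helper: derive response headers from limiter state.
//   limit        - max requests allowed per window
//   current      - requests counted so far in this window
//   resetSeconds - seconds until the current window ends
function rateLimitHeaders(limit, current, resetSeconds) {
  const headers = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, limit - current)),
    // Assumption: reset is expressed as seconds remaining; some APIs
    // send an epoch timestamp instead.
    "X-RateLimit-Reset": String(resetSeconds),
  };
  // Only tell the client to back off once it is actually over the limit.
  if (current > limit) {
    headers["Retry-After"] = String(resetSeconds);
  }
  return headers;
}
```

In an Express handler the result could be passed to `res.set(...)` before sending the 429 response.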
Non-Functional Requirements
- Low latency and high throughput
- Distributed and fault-tolerant
- Accurate, consistent counters across nodes
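Strategies
- Fixed Window: Count requests in each fixed window (e.g., 1 minute) and reset the count when the window ends
- Sliding Window Log: Store a timestamp per request and discard timestamps older than the window before counting
- Token Bucket: Tokens are added at a steady rate up to a maximum capacity; each request consumes one token
- Leaky Bucket: Requests are queued and drained at a constant rate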
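Two of the strategies above can be sketched in memory as follows. This is a minimal single-process illustration, not a distributed implementation; the class names, method names, and the injectable `now` parameter (milliseconds, used to keep the example deterministic) are all assumptions made for the sketch.

```javascript
// Token bucket: refills continuously at `refillPerSec`, capped at `capacity`.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.last = now;        // time of the last refill, in ms
  }

  // Returns true and consumes one token if the request is allowed.
  tryRemove(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec
    );
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Sliding window log: keeps one timestamp per accepted request.
class SlidingWindowLog {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.log = []; // accepted-request timestamps, ascending
  }

  // Returns true and records the request if it fits in the window.
  tryAccept(now = Date.now()) {
    // Drop timestamps that have fallen out of the window.
    while (this.log.length > 0 && this.log[0] <= now - this.windowMs) {
      this.log.shift();
    }
    if (this.log.length >= this.limit) return false;
    this.log.push(now);
    return true;
  }
}
```

The token bucket tolerates short bursts up to `capacity` while holding the long-run rate to `refillPerSec`. The log is exact but stores one entry per accepted request, so its memory use grows with the limit.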
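Example: Redis Fixed Window (per IP)
INCR ratelimit:192.168.1.10
EXPIRE ratelimit:192.168.1.10 60
Issue EXPIRE only when INCR returns 1 (the first request in a window). Calling it on every request keeps pushing the expiry back, so the counter never resets under steady traffic.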
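Express.js Middleware
const express = require("express");
const { createClient } = require("redis");

const app = express();
const redis = createClient();
redis.connect().catch(console.error); // node-redis v4+ requires an explicit connect
const limit = 100; // max requests per 60-second window

app.use(async (req, res, next) => {
  try {
    const key = `ratelimit:${req.ip}`;
    const current = await redis.incr(key);
    // Start the TTL only on the first request of a window.
    if (current === 1) await redis.expire(key, 60);
    if (current > limit) {
      return res.status(429).send("Rate limit exceeded");
    }
    next();
  } catch (err) {
    next(err); // surface Redis failures to the error handler
  }
});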
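The middleware above sends INCR and EXPIRE as two separate commands, so a crash between them can leave a counter with no expiry. One common way to close that gap is to run both steps in a single Lua script, which Redis executes atomically. The sketch below assumes a connected node-redis v4+ client whose `eval` method takes `(script, { keys, arguments })`; the names `FIXED_WINDOW_LUA` and `hitFixedWindow` are hypothetical.

```javascript
// Lua runs atomically inside Redis: increment the counter and set the TTL
// only when the key was just created. Returns the new counter value.
const FIXED_WINDOW_LUA = `
local current = redis.call("INCR", KEYS[1])
if current == 1 then
  redis.call("EXPIRE", KEYS[1], ARGV[1])
end
return current
`;

// `client` is assumed to be a connected node-redis v4+ client, or any object
// exposing a compatible `eval(script, { keys, arguments })` method.
async function hitFixedWindow(client, key, windowSeconds) {
  return client.eval(FIXED_WINDOW_LUA, {
    keys: [key],
    arguments: [String(windowSeconds)], // Redis arguments are sent as strings
  });
}
```

Inside the middleware, `const current = await hitFixedWindow(redis, key, 60);` would stand in for the separate `incr` and `expire` calls.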
Considerations
- Global vs per-API limits
- Per-auth-token vs per-IP vs org-level granularity
- Exemptions (internal systems, admin users)
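These choices mostly come down to how the counter key is built. The sketch below shows one possible approach; the request shape (`ip`, `apiToken`, `orgId`, `isInternal`, `route`), the `perRoute` option, and the precedence of token over org over IP are all assumptions made for illustration.

```javascript
// Hypothetical request shape: { ip, apiToken?, orgId?, isInternal?, route? }.
// Returns null for exempt callers, otherwise the counter key to use.
function rateLimitKey(req, { perRoute = false } = {}) {
  if (req.isInternal) return null; // exemption: skip limiting entirely

  // Assumed precedence: prefer the most specific identity available.
  let subject;
  if (req.apiToken) subject = `token:${req.apiToken}`;
  else if (req.orgId) subject = `org:${req.orgId}`;
  else subject = `ip:${req.ip}`;

  // Per-API limits put the route in the key; global limits share one counter.
  return perRoute
    ? `ratelimit:${subject}:${req.route}`
    : `ratelimit:${subject}`;
}
```

A middleware would call `next()` immediately when the key is `null` and otherwise use the key for its counter.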
Observability
- Rate-limited requests per minute
- Top IPs hitting limits
- Latency of limit enforcement logic
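As a rough illustration of where these numbers come from, the sketch below keeps an in-memory tally of rejections per IP and times a limiter check. It is illustrative only: the names `LimitStats` and `timed` are hypothetical, and a production system would export these values to a metrics backend rather than hold them in process memory.

```javascript
// In-memory tally of rejected requests per IP.
class LimitStats {
  constructor() {
    this.rejectedByIp = new Map();
  }

  // Call this whenever a request is answered with 429.
  recordRejection(ip) {
    this.rejectedByIp.set(ip, (this.rejectedByIp.get(ip) || 0) + 1);
  }

  // Returns the `n` IPs with the most rejections as [ip, count] pairs.
  topIps(n) {
    return [...this.rejectedByIp.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, n);
  }
}

// Runs an async limiter check and reports how long it took, in milliseconds.
async function timed(check) {
  const start = process.hrtime.bigint();
  const result = await check();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, elapsedMs };
}
```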
Tools and Infra
- Store: Redis, Memcached, in-memory (for single-node)
- API Gateway: NGINX, Envoy, Kong built-in limits
- Metrics: Prometheus, Grafana, Datadog
Final Insight
Rate limiting defends APIs from misuse and protects backend services. Use a fast-access store such as Redis, expose meaningful headers, and set up alerts on limiter metrics to detect abuse early.