System Design FAQ: Top Questions

4. How would you design a Rate Limiter?

A rate limiter caps the number of requests a user or client can make to an API within a given time period. It protects systems from abuse, ensures fair usage, and keeps backend services from being overloaded.

🎯 Use Cases

API gateways (e.g., limit 1000 requests per hour per IP)
Login protection (e.g., 5 failed attempts per minute)
DDoS mitigation and traffic shaping

📦 Rate Limiting Algorithms

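Token Bucket: Tokens are added at a fixed rate up to a maximum capacity; each request consumes one, so short bursts are allowed
Leaky Bucket: Enforces a steady outflow rate regardless of how requests arrive (ideal for smoothing traffic)
Fixed Window: Counts requests within fixed time chunks (e.g., 1 min); simple, but allows bursts at window boundaries
Sliding Window: More accurate because it counts requests over a rolling time interval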
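
To make the sliding window idea concrete, here is a minimal in-memory sketch of the "sliding window log" variant. The class name, its parameters, and the injectable `clock` argument are illustrative assumptions, not part of any library; the log stores one timestamp per accepted request, so memory grows with the limit.

```python
import time
from collections import deque


class SlidingWindowLog:
    """Allow at most `limit` requests in any rolling `window`-second span."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock         # injectable so the sketch is easy to test
        self.timestamps = deque()  # arrival times of accepted requests

    def allow(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Because the window is measured back from each request rather than from a fixed boundary, a client cannot double its allowance by straddling two windows.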

⚙️ Redis-based Fixed Window Rate Limiter (Python)


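import time
import redis

r = redis.Redis()
LIMIT = 5
WINDOW = 60  # seconds

def is_allowed(client_id="127.0.0.1"):
    # One counter per client per window; the window index is part of the key.
    key = f"rate:{client_id}:{int(time.time()) // WINDOW}"
    # Run INCR and EXPIRE in a single MULTI/EXEC transaction so a crash
    # between the two calls cannot leave a counter without a TTL.
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, WINDOW)
    count, _ = pipe.execute()
    return count <= LIMIT

# Usage
if is_allowed():
    print("Request allowed")
else:
    print("Rate limit exceeded")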
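
One known weakness of a fixed window is the burst at the boundary between two windows. The sketch below replays that case with an in-memory stand-in for the Redis counter; the `FixedWindowCounter` class and the explicit `now` argument are assumptions made so the example runs without a Redis server.

```python
from collections import defaultdict

LIMIT = 5
WINDOW = 60  # seconds


class FixedWindowCounter:
    """In-memory stand-in for the Redis counter, keyed by window index."""

    def __init__(self, limit=LIMIT, window=WINDOW):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # window index -> request count

    def allow(self, now):
        bucket = int(now) // self.window  # window index, as in the Redis key
        self.counts[bucket] += 1
        return self.counts[bucket] <= self.limit


limiter = FixedWindowCounter()
# Five requests at the very end of window 0, then five at the start of window 1.
late = [limiter.allow(59) for _ in range(5)]
early = [limiter.allow(60) for _ in range(5)]
accepted = sum(late) + sum(early)
print(accepted)  # 10 requests accepted within about one second
```

With a limit of 5 per minute, the client still gets 10 requests through in roughly one second, which is the case a sliding window is meant to close.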

🚦 Token Bucket in Go


package main

import (
    "fmt"
    "math"
    "sync"
    "time"
)

// TokenBucket keeps fractional tokens so that refills smaller than one
// token still accumulate between calls instead of being truncated to zero.
type TokenBucket struct {
    mu         sync.Mutex // guards the fields below for concurrent callers
    capacity   float64
    tokens     float64
    rate       float64 // tokens per second
    lastRefill time.Time
}

// Allow refills the bucket for the elapsed time, then spends one token if available.
func (tb *TokenBucket) Allow() bool {
    tb.mu.Lock()
    defer tb.mu.Unlock()

    now := time.Now()
    elapsed := now.Sub(tb.lastRefill).Seconds()
    tb.tokens = math.Min(tb.capacity, tb.tokens+elapsed*tb.rate)
    tb.lastRefill = now

    if tb.tokens >= 1 {
        tb.tokens--
        return true
    }
    return false
}

func main() {
    bucket := &TokenBucket{capacity: 10, tokens: 10, rate: 1, lastRefill: time.Now()}
    for i := 0; i < 15; i++ {
        if bucket.Allow() {
            fmt.Println("Allowed")
        } else {
            fmt.Println("Throttled")
        }
        time.Sleep(200 * time.Millisecond)
    }
}

🧱 Architecture Components

In-memory rate limiters (per instance; counters are not shared across servers)
Distributed limiters backed by a shared store such as Redis or Memcached
Edge/API Gateway integration (Kong, Envoy, AWS API Gateway)
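
As a rough illustration of how a per-instance limiter plugs into request handling, the sketch below wraps a handler and answers over-limit clients with HTTP 429 and a `Retry-After` header. The `InMemoryLimiter` class, the `rate_limited` wrapper, and the `(client_id, request)` handler signature are hypothetical names chosen for this example; a distributed setup would replace the dictionary with a shared store.

```python
import math
import time


class InMemoryLimiter:
    """Per-instance fixed-window limiter; state is not shared across processes."""

    def __init__(self, limit, window, clock=time.time):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.state = {}  # client_id -> (window index, request count)

    def check(self, client_id):
        """Return (allowed, seconds until the current window ends)."""
        now = self.clock()
        index = int(now // self.window)
        seen_index, count = self.state.get(client_id, (index, 0))
        if seen_index != index:  # a new window has started; reset the count
            count = 0
        count += 1
        self.state[client_id] = (index, count)
        retry_after = math.ceil((index + 1) * self.window - now)
        return count <= self.limit, retry_after


def rate_limited(handler, limiter):
    """Wrap a handler so over-limit clients get HTTP 429 with Retry-After."""
    def wrapped(client_id, request):
        allowed, retry_after = limiter.check(client_id)
        if not allowed:
            return 429, {"Retry-After": str(retry_after)}, "Too Many Requests"
        return handler(client_id, request)
    return wrapped
```

Keeping the limiter behind a small `check` method means the same wrapper could sit in front of an in-memory, Redis-backed, or gateway-provided implementation.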

☁️ Cloud Examples

AWS API Gateway: Use usage plans + throttling settings
Cloudflare: Rules by IP, path, or header
Google Cloud: Use Cloud Armor with rate-based rules

🧪 Observability Tips

Log rate-limited events
Track metrics such as the number of throttled requests and the number of blocked IPs
Expose health endpoints that skip rate limiting
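
The tips above can be combined in a thin wrapper around whatever limiter is in use. This is a sketch under stated assumptions: `metrics` is a plain `Counter` standing in for a real metrics client, `/healthz` is a hypothetical health-check path, and `is_allowed` is any callable that takes a client IP and returns a boolean.

```python
import logging
from collections import Counter

logger = logging.getLogger("rate_limiter")
metrics = Counter()          # stand-in for a real metrics client
EXEMPT_PATHS = {"/healthz"}  # hypothetical health-check path


def observe(path, client_ip, is_allowed):
    """Apply is_allowed(client_ip) unless the path is exempt, and record the outcome."""
    if path in EXEMPT_PATHS:
        metrics["requests_exempt"] += 1
        return True
    if is_allowed(client_ip):
        metrics["requests_allowed"] += 1
        return True
    metrics["requests_throttled"] += 1
    logger.warning("rate limit exceeded ip=%s path=%s", client_ip, path)
    return False
```

Exempting the health path before the limiter runs keeps load balancer probes from being counted or throttled.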

📌 Final Insight

Rate limiting is essential for controlling access, preventing abuse, and ensuring fair usage. Whether implemented in application code or at an API gateway, it’s a key part of resilient system architecture.
