System Design FAQ: Top Questions

36. How would you design a Rate Limiter Service?

A rate limiter restricts how many requests a client can make to a server within a defined time window. It prevents abuse and overload, and it ensures fair usage across users and APIs.

📋 Functional Requirements

  • Define request limit per user, IP, or API key (e.g., 100 requests/min)
  • Enforce limits with near-zero latency
  • Support burst allowance and quota renewal
  • Expose API to retrieve remaining quota

📦 Non-Functional Requirements

  • Low latency (sub-5ms checks)
  • High availability and distributed consistency
  • Global enforcement across edge and backend

🏗️ Core Components

  • Limiter Engine: Token Bucket or Sliding Window implementation
  • KV Store: Redis with TTLs per client key
  • API Gateway Middleware: Calls limiter on every request
  • Dashboard: Quota management and analytics
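The Limiter Engine bullet names Sliding Window as an alternative to Token Bucket. A minimal in-memory sketch of the sliding window log variant follows, assuming one timestamp queue per client key; the class name `SlidingWindowLog` and its parameters are illustrative, not part of any library. A production version would keep this state in the shared KV store rather than in process memory.

```python
from collections import deque


class SlidingWindowLog:
    """Hypothetical sliding window log limiter kept in process memory."""

    def __init__(self, limit, window_seconds):
        self.limit = limit                # max requests per window
        self.window = window_seconds      # window length in seconds
        self.hits = {}                    # client key -> deque of timestamps

    def allow(self, client_key, now):
        q = self.hits.setdefault(client_key, deque())
        # Drop timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) < self.limit:
            q.append(now)
            return True
        return False


limiter = SlidingWindowLog(limit=3, window_seconds=60)
# Three requests fit, the fourth is rejected, and a slot frees up at t=60.
decisions = [limiter.allow("user123", t) for t in (0, 10, 20, 30, 60, 61)]
```

The log variant is exact but stores one timestamp per accepted request, so memory grows with the limit; a token bucket stores only two numbers per client.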

🚦 Token Bucket Logic (Python)


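import time

import redis

# Assumes a reachable Redis; decode_responses=True returns str, not bytes.
r = redis.Redis(decode_responses=True)

CAPACITY = 100          # maximum burst size, in tokens
REFILL_RATE = 100 / 60  # tokens per second, i.e. 100 requests/min


def allow_request(user_id):
    key = f"bucket:{user_id}"
    now = time.time()

    # A missing hash means a new client, which starts with a full bucket.
    bucket = r.hgetall(key)
    last_refill = float(bucket.get("last", now))
    tokens = float(bucket.get("tokens", CAPACITY))

    # Credit tokens earned since the last update, capped at capacity.
    elapsed = max(0.0, now - last_refill)
    tokens = min(CAPACITY, tokens + elapsed * REFILL_RATE)

    if tokens < 1:
        return False

    # Not atomic: concurrent callers can race between the read and this write.
    r.hset(key, mapping={"tokens": tokens - 1, "last": now})
    return True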
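The refill arithmetic can also be exercised without Redis. The sketch below is an in-memory variant with an injectable clock, which makes the burst and refill behaviour deterministic; `TokenBucket` and `FakeClock` are illustrative names, and the capacity and rate are chosen only to keep the numbers exact.

```python
import time


class TokenBucket:
    """Hypothetical in-memory token bucket; `clock` is injectable for tests."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity          # maximum burst size, in tokens
        self.refill_rate = refill_rate    # tokens added per second
        self.clock = clock
        self.tokens = float(capacity)     # new buckets start full
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Clamp at zero so a clock that steps backwards never removes tokens.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class FakeClock:
    """Manually advanced clock for deterministic examples."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
bucket = TokenBucket(capacity=10, refill_rate=2.0, clock=clock)
burst = [bucket.allow() for _ in range(11)]  # 10 allowed, the 11th rejected
clock.now += 0.5                             # 0.5 s * 2 tokens/s = 1 token
after_refill = bucket.allow()
```

Capacity controls how large a burst is tolerated, while the refill rate sets the sustained request rate; the two can be tuned independently.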
        

🔧 Redis TTL Key Setup


SET bucket:user123.tokens 50 EX 60 NX
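With `EX 60 NX`, the key is created only if it does not already exist and expires after 60 seconds, so the quota renews once per window. One way this could be driven from Python is sketched below as a fixed-window counter. The `set`, `decr`, `ttl`, and `expire` calls mirror redis-py method names; `consume` is a hypothetical helper, and `FakeRedis` is an in-memory stand-in (its TTLs never count down) included only so the sketch runs without a server.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for the few redis-py calls used below."""

    def __init__(self):
        self.values = {}
        self.expiry = {}  # key -> TTL in seconds (never counts down here)

    def set(self, key, value, ex=None, nx=False):
        if nx and key in self.values:
            return None  # NX blocks the write when the key already exists
        self.values[key] = int(value)
        if ex is not None:
            self.expiry[key] = ex
        else:
            self.expiry.pop(key, None)
        return True

    def decr(self, key):
        self.values[key] = self.values.get(key, 0) - 1
        return self.values[key]

    def ttl(self, key):
        if key not in self.values:
            return -2  # key does not exist
        return self.expiry.get(key, -1)  # -1 means no expiry is set

    def expire(self, key, seconds):
        if key not in self.values:
            return False
        self.expiry[key] = seconds
        return True


def consume(client, user_id, limit=50, window_seconds=60):
    """Return True while the client still has quota in the current window."""
    key = f"bucket:{user_id}.tokens"
    # SET ... EX ... NX: start a new window only if none is active.
    client.set(key, limit, ex=window_seconds, nx=True)
    # DECR is atomic, so concurrent callers never observe the same value.
    remaining = client.decr(key)
    # If the key expired between SET and DECR, DECR recreated it without a
    # TTL; restore the expiry so the client is not blocked indefinitely.
    if remaining < 0 and client.ttl(key) == -1:
        client.expire(key, window_seconds)
    return remaining >= 0


client = FakeRedis()
results = [consume(client, "user123", limit=3) for _ in range(4)]
```

Against a real server, `client` would be a `redis.Redis` instance. A fixed window allows up to twice the limit across a window boundary, which is one reason the token bucket is often preferred.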
        

📊 API Rate Limit Header Example


HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 37
X-RateLimit-Reset: 1691878800
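A sketch of how middleware might produce these headers, assuming the limiter reports the limit, the remaining quota, and the reset time as a Unix timestamp. The function name `rate_limit_response` is illustrative. Responding with status 429 and a `Retry-After` header on rejection is a common convention that the example above does not show, so it is added here as an assumption.

```python
import math


def rate_limit_response(allowed, limit, remaining, reset_epoch, now):
    """Return (status_code, headers) for a rate limit decision."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_epoch)),
    }
    if allowed:
        return 200, headers
    # Tell the client how many whole seconds to wait before retrying.
    headers["Retry-After"] = str(max(0, math.ceil(reset_epoch - now)))
    return 429, headers


ok_status, ok_headers = rate_limit_response(
    True, 100, 37, 1691878800, now=1691878770.0
)
limited_status, limited_headers = rate_limit_response(
    False, 100, 0, 1691878800, now=1691878770.5
)
```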
        

🧪 Testing Strategy

  • High-throughput script with expected failure after N requests
  • Latency benchmarks under load
  • Simulate clock drift / multi-region collisions
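The first bullet could be sketched as a small harness that fires a burst and reports where the first rejection occurs. `first_rejection` and `make_counter_limiter` are hypothetical helpers; the counter limiter is only a stand-in for whichever limiter is under test.

```python
def first_rejection(allow, attempts):
    """Call allow() up to `attempts` times.

    Returns the 1-based index of the first rejected call, or None if every
    call was allowed.
    """
    for i in range(1, attempts + 1):
        if not allow():
            return i
    return None


def make_counter_limiter(limit):
    """Hypothetical stand-in limiter that allows exactly `limit` calls."""
    remaining = limit

    def allow():
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return True
        return False

    return allow


N = 100
# With a limit of N, the first rejection is expected on request N + 1.
result = first_rejection(make_counter_limiter(N), attempts=N + 50)
```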

📈 Observability

  • Throttle counts per endpoint and region
  • Quota exhaustion alerts
  • Redis operation error rate and latency
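A minimal sketch of the first bullet, assuming throttle decisions are counted in process and keyed by endpoint and region. `ThrottleMetrics` is an illustrative name; a real deployment would more likely export these counters to a metrics system than hold them in memory.

```python
from collections import Counter


class ThrottleMetrics:
    """Hypothetical in-process counters keyed by (endpoint, region)."""

    def __init__(self):
        self.requests = Counter()
        self.throttled = Counter()

    def record(self, endpoint, region, allowed):
        key = (endpoint, region)
        self.requests[key] += 1
        if not allowed:
            self.throttled[key] += 1

    def throttle_rate(self, endpoint, region):
        """Fraction of requests rejected for this endpoint and region."""
        key = (endpoint, region)
        total = self.requests[key]
        return self.throttled[key] / total if total else 0.0


metrics = ThrottleMetrics()
for allowed in (True, True, True, False):
    metrics.record("/search", "eu-west", allowed)
rate = metrics.throttle_rate("/search", "eu-west")
```

A quota exhaustion alert could then fire when this rate stays above a chosen threshold for a sustained period.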

🧰 Tools/Infra Used

  • Redis: Fast atomic operations with TTL support
  • Go/Python: Middleware or service implementation
  • API Gateway: Envoy, Kong, or NGINX with plugin
  • Grafana: Metrics dashboard for quotas

📌 Final Insight

Rate limiting is critical for API safety. Use a resilient store (such as Redis), a fair algorithm (such as Token Bucket), and good telemetry to protect infrastructure while preserving user experience.