API Design: Scenario-Based Questions

67. How do you design and implement rate limiting for APIs?

Rate limiting prevents abuse, ensures fair usage, and protects backend services from overload. It’s a fundamental part of API design for both public and internal interfaces.

πŸ“ Core Design Goals

  • Fairness: Prevent one client from starving others.
  • Security: Throttle brute-force or bot activity.
  • Stability: Shield services from traffic spikes or DoS patterns.

βš™οΈ Algorithms

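  • Token Bucket: Tokens refill at a fixed rate up to a maximum capacity; each request spends one, so short bursts up to the capacity are allowed.
  • Leaky Bucket: Requests queue and drain at a constant rate, smoothing the output; requests that arrive when the queue is full are dropped.
  • Fixed Window: A simple counter per time window (e.g., 1000 req/min); a client can briefly send up to twice the limit across a window boundary.
  • Sliding Window: Counts requests over a rolling interval (a timestamp log, or a weighted blend of the current and previous windows), which avoids the boundary burst of fixed windows.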
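
As an illustration of the token bucket described above, here is a minimal single-process sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` argument are assumptions made for this example, not part of any particular library.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock              # injectable so the bucket can be tested
        self._tokens = capacity          # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=3, refill_rate=1.0)
    # The first three calls drain the burst capacity; the fourth is rejected
    # unless enough time has passed for a token to refill.
    print([bucket.allow() for _ in range(4)])
```

The capacity sets the largest burst a client can send at once, while the refill rate sets the sustained request rate. This sketch is not thread-safe and keeps state in one process only.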

📊 Dimensions of Limiting

  • Per IP address or user ID
  • Per API key or client app
  • Per endpoint (e.g., login stricter than GET /status)
  • Region-aware throttling
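
One way to combine these dimensions is to look up a limit per endpoint and build the counter key from the region, endpoint, and caller identity. The sketch below assumes a hypothetical policy table; the names `Limit`, `ENDPOINT_LIMITS`, `limit_for`, and `counter_key`, and the specific numbers, are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limit:
    requests: int
    window_seconds: int


# Hypothetical policy: login is far stricter than a cheap status check.
ENDPOINT_LIMITS = {
    "POST /login": Limit(5, 60),
    "GET /status": Limit(1000, 60),
}
DEFAULT_LIMIT = Limit(100, 60)


def limit_for(endpoint: str) -> Limit:
    """Return the limit for an endpoint, falling back to the default."""
    return ENDPOINT_LIMITS.get(endpoint, DEFAULT_LIMIT)


def counter_key(endpoint: str, ip: str,
                api_key: Optional[str] = None, region: str = "global") -> str:
    """Build a counter key; prefer the API key over the IP when one is present."""
    identity = f"key:{api_key}" if api_key else f"ip:{ip}"
    return f"ratelimit:{region}:{endpoint}:{identity}"


if __name__ == "__main__":
    print(limit_for("POST /login"))
    print(counter_key("POST /login", "203.0.113.7", api_key="abc", region="eu"))
```

Keying on the API key when available avoids penalizing many users who share one IP address, such as those behind a corporate NAT.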

🧰 Implementation Options

  • API Gateway: Native support in Kong, AWS API Gateway, Apigee, etc.
  • Reverse Proxies: NGINX, Envoy, HAProxy with Lua or filters.
  • Middleware: Express.js, Flask, Django middleware with Redis counters.
  • Distributed: Redis, Memcached, or custom in-memory counter services.
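
The middleware-plus-shared-counter approach can be sketched as a fixed-window check against a store that supports an atomic increment with expiry. To keep the example runnable without dependencies, `InMemoryStore` below is a hypothetical single-process stand-in; with Redis, the same pattern is commonly built on the `INCR` and `EXPIRE` commands. The function name `allow_request` and the key layout are also assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class InMemoryStore:
    """Hypothetical stand-in for a shared store such as Redis (one process only)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[int, float]] = {}   # key -> (count, expires_at)

    def incr_with_ttl(self, key: str, ttl: float) -> int:
        """Increment the counter for `key`, starting a new one if it has expired."""
        now = self._clock()
        count, expires_at = self._data.get(key, (0, 0.0))
        if now >= expires_at:            # missing or expired: start a fresh counter
            count, expires_at = 0, now + ttl
        count += 1
        self._data[key] = (count, expires_at)
        return count


def allow_request(store: InMemoryStore, client_id: str, limit: int,
                  window_seconds: int,
                  clock: Callable[[], float] = time.time) -> bool:
    """Fixed-window check: at most `limit` requests per client per window."""
    window = int(clock() // window_seconds)          # index of the current window
    key = f"ratelimit:{client_id}:{window}"
    return store.incr_with_ttl(key, ttl=window_seconds) <= limit


if __name__ == "__main__":
    store = InMemoryStore()
    # Normally three allowed requests followed by one rejection; the result can
    # differ if the calls happen to straddle a window boundary.
    print([allow_request(store, "client-1", limit=3, window_seconds=60)
           for _ in range(4)])
```

Because every application instance talks to the same store, the limit holds across the whole fleet rather than per instance. This stand-in never evicts old keys, which a real store would handle through key expiry.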

✅ Best Practices

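  • Return 429 Too Many Requests with a Retry-After header so clients know when to retry.
  • Expose usage headers (X-RateLimit-Limit, X-RateLimit-Remaining, etc.) so clients can pace themselves before hitting the limit.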
  • Use global and per-service quotas.
  • Rate limit at multiple layers (edge, app, DB).
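
A framework-agnostic sketch of the response side of these practices follows. The helper names `rate_limit_headers` and `too_many_requests` and the JSON error body are assumptions. The `X-RateLimit-*` names are a widely used convention rather than a formal standard, and providers differ on whether the reset value is a number of seconds or a timestamp; this sketch assumes seconds until reset.

```python
import math
from typing import Dict, Tuple


def rate_limit_headers(limit: int, remaining: int,
                       reset_in_seconds: float) -> Dict[str, str]:
    """Usage headers to attach to every response, allowed or rejected."""
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),   # never report negatives
        "X-RateLimit-Reset": str(math.ceil(reset_in_seconds)),
    }


def too_many_requests(limit: int,
                      reset_in_seconds: float) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple for a rejected request."""
    headers = rate_limit_headers(limit, 0, reset_in_seconds)
    # Retry-After in its delay-seconds form, rounded up so clients do not retry early.
    headers["Retry-After"] = str(math.ceil(reset_in_seconds))
    headers["Content-Type"] = "application/json"
    body = '{"error": "rate_limit_exceeded"}'
    return 429, headers, body


if __name__ == "__main__":
    status, headers, body = too_many_requests(limit=100, reset_in_seconds=12.3)
    print(status, headers["Retry-After"], body)
```

Sending the usage headers on successful responses as well lets well-behaved clients slow down before they are ever rejected.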

🚫 Common Pitfalls

  • Single-node in-memory counters (each instance counts separately, so the effective limit grows with the number of instances).
  • No visibility into rejected traffic patterns.
  • Inconsistent enforcement across services.
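
To address the visibility pitfall, rejected requests can be counted by client and endpoint so that patterns stand out. The following is a minimal in-process sketch; the `RejectionLog` class is hypothetical, and a production system would more likely export these counts to a metrics or logging pipeline.

```python
from collections import Counter
from typing import List, Tuple


class RejectionLog:
    """Counts rejected requests per (client, endpoint) pair."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, client_id: str, endpoint: str) -> None:
        """Call this whenever the limiter rejects a request."""
        self._counts[(client_id, endpoint)] += 1

    def top_offenders(self, n: int = 5) -> List[Tuple[Tuple[str, str], int]]:
        """Return the `n` most frequently rejected (client, endpoint) pairs."""
        return self._counts.most_common(n)


if __name__ == "__main__":
    log = RejectionLog()
    log.record("client-1", "POST /login")
    log.record("client-1", "POST /login")
    log.record("client-2", "GET /status")
    print(log.top_offenders(1))
```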

📌 Final Insight

Rate limiting is not just about protection; it’s about control and predictability. Design with visibility, fairness, and user experience in mind.