DeepReach
System Design12 interview questions

Designing a Rate Limiter for Distributed Systems

Token bucket vs leaky bucket vs sliding window. How to implement rate limiting at scale with Redis, and handling edge cases in distributed environments.

System DesignRedisRate LimitingDistributed

Designing a Rate Limiter for Distributed Systems

Algorithms

Token Bucket

  • Tokens added at fixed rate, up to a maximum bucket size
  • Requests consume tokens; rejected if bucket empty
  • Allows bursting up to bucket size

Sliding Window Log

  • Track timestamps of all requests in a window
  • Count requests in last N seconds
  • More accurate but higher memory usage

Fixed Window Counter

  • Increment counter per time window
  • Reset at window boundary
  • Vulnerable to boundary burst (2x traffic at window edge)

Redis-Based Implementation

lua
-- Token bucket in Lua (atomic in Redis)
local key = KEYS[1]
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local bucket = redis.call("HMGET", key, "tokens", "last_refill")
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now

local elapsed = now - last_refill
local refill = math.floor(elapsed * rate)
tokens = math.min(capacity, tokens + refill)

if tokens >= requested then
    tokens = tokens - requested
    redis.call("HMSET", key, "tokens", tokens, "last_refill", now)
    redis.call("EXPIRE", key, math.ceil(capacity / rate) * 2)
    return 1
else
    return 0
end

Interview Questions

  1. 1.How do you handle clock skew between distributed nodes?
  2. 2.What happens to rate limiter state during a Redis failover?
  3. 3.How would you rate limit per user vs per IP vs per API key?
  4. 4.What is the difference between rate limiting and throttling?