System Design12 interview questions

Designing a Rate Limiter for Distributed Systems

Token bucket vs leaky bucket vs sliding window. How to implement rate limiting at scale with Redis, and handling edge cases in distributed environments.

System DesignRedisRate LimitingDistributed

Designing a Rate Limiter for Distributed Systems

Algorithms

Token Bucket

Tokens added at fixed rate, up to a maximum bucket size
Requests consume tokens; rejected if bucket empty
Allows bursting up to bucket size

Sliding Window Log

Track timestamps of all requests in a window
Count requests in last N seconds
More accurate but higher memory usage

Fixed Window Counter

Increment counter per time window
Reset at window boundary
Vulnerable to boundary burst (2x traffic at window edge)

Redis-Based Implementation

lua

-- Token bucket in Lua (atomic in Redis)
local key = KEYS[1]
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local bucket = redis.call("HMGET", key, "tokens", "last_refill")
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now

local elapsed = now - last_refill
local refill = math.floor(elapsed * rate)
tokens = math.min(capacity, tokens + refill)

if tokens >= requested then
    tokens = tokens - requested
    redis.call("HMSET", key, "tokens", tokens, "last_refill", now)
    redis.call("EXPIRE", key, math.ceil(capacity / rate) * 2)
    return 1
else
    return 0
end

Interview Questions

1.How do you handle clock skew between distributed nodes?
2.What happens to rate limiter state during a Redis failover?
3.How would you rate limit per user vs per IP vs per API key?
4.What is the difference between rate limiting and throttling?