Rate Limiter System Design

Comprehensive analysis of rate limiting algorithms for 100M DAU at 1.15M RPS

System Requirements & Estimates

  • 100M daily active users
  • 1.15M requests per second (≈100M users × 1,000 requests/day ÷ 86,400 s)
  • ~10GB memory for the rate-limit cache
  • HTTP 429 returned when a client exceeds its limit

Functional Requirements

  • Accurately limit clients that exceed their configured request threshold
  • Return HTTP 429 (Too Many Requests) for rejected requests
  • Support per-client limits keyed by user ID, API key, or IP address

Non-Functional Requirements

  • Low latency: the limiter must not noticeably slow the request path
  • Low memory footprint (target ~10GB of cache for 100M users)
  • High fault tolerance: a limiter failure must not take the whole system down

Algorithm Comparison Matrix

| Algorithm | Burst Handling | Storage Usage | Complexity | Latency | Best Use Case |
| --- | --- | --- | --- | --- | --- |
| Token Bucket | ✅ Allows up to bucket size | Low (~4-5GB) | Medium | O(1) | API rate limiting with burst allowance |
| Leaky Bucket | ❌ No bursts allowed | Low (~4-5GB) | Low | O(1) | Steady flow-rate enforcement |
| Fixed Window | ⚠️ Edge-case issues at window boundaries | Very Low (~2GB) | Very Low | O(1) | Simple rate limiting needs |
| Sliding Window Counter | ✅ Smooth distribution | Medium (~6GB) | Medium | O(1) | Balanced accuracy and efficiency |
| Sliding Window Log | ✅ Most accurate | High (~800GB) | High | O(log N) | High-precision rate limiting |
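The fixed-window edge case flagged in the table can be made concrete with a short sketch (illustrative `FixedWindowLimiter` class, not part of the design; timestamps are supplied by the caller so the behavior is deterministic): a burst at the tail of one window plus a burst at the head of the next slips through at twice the nominal limit.

```python
from collections import defaultdict

class FixedWindowLimiter:
    """Illustrative fixed-window counter: at most `limit` requests per window."""
    def __init__(self, limit: int, window_secs: int):
        self.limit = limit
        self.window_secs = window_secs
        self.counters = defaultdict(int)  # window index -> request count

    def allow(self, now: float) -> bool:
        window = int(now // self.window_secs)
        if self.counters[window] < self.limit:
            self.counters[window] += 1
            return True
        return False

# Edge case: 10 requests at t=59s (end of window 0) and 10 more at
# t=61s (start of window 1) are ALL allowed, even though 20 requests
# arrived within 2 seconds against a limit of 10 per minute.
limiter = FixedWindowLimiter(limit=10, window_secs=60)
allowed = sum(limiter.allow(t) for t in [59.0] * 10 + [61.0] * 10)
print(allowed)  # 20
```

This is why the table marks Fixed Window as "edge-case issues" despite its minimal storage cost.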

Interactive Algorithm Visualizations

Token Bucket Algorithm

The bucket starts full (10/10 tokens).

Capacity: 10 tokens | Refill rate: 3 tokens per refill interval
Each request consumes one token; requests are rejected with a 429 when the bucket is empty.
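The refill arithmetic can be sketched in a few lines of plain Python (an illustrative in-process `TokenBucket`, not the Redis-backed implementation below; the caller supplies timestamps so runs are deterministic):

```python
class TokenBucket:
    """Minimal sketch of the bucket above: capacity 10, refilled continuously."""
    def __init__(self, capacity: int = 10, refill_rate: float = 3.0):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = float(capacity)   # bucket starts full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket()
burst = sum(bucket.allow(0.0) for _ in range(12))  # burst of 12 at t=0
later = bucket.allow(1.0)                          # one second later
print(burst, later)  # 10 True
```

The burst drains the bucket after 10 requests; one second later 3 tokens have refilled, so the next request succeeds.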

Leaky Bucket Algorithm

Requests leak at a constant rate. Overflow results in a 429 rejection.
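The same idea in sketch form (illustrative `LeakyBucket` class with caller-supplied timestamps): arriving requests fill a bucket of fixed capacity, the bucket drains at a constant rate, and anything that would overflow is rejected.

```python
class LeakyBucket:
    """Sketch: bucket drains at leak_rate requests/sec; overflow -> reject."""
    def __init__(self, capacity: int = 5, leak_rate: float = 2.0):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0  # current queue depth
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Drain the bucket for the elapsed time, then try to add the request.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False  # bucket full -> respond 429

bucket = LeakyBucket()
accepted = sum(bucket.allow(0.0) for _ in range(8))  # burst of 8 at t=0
print(accepted)  # 5
```

Unlike the token bucket, the burst is clipped at the bucket's capacity: output toward the backend never exceeds the steady leak rate.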

Sliding Window Visualization

Window slides continuously, counting requests within the time frame.
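The sliding window *counter* variant from the comparison table approximates this without storing per-request timestamps: it keeps one counter per fixed window and weights the previous window by how much of it still overlaps the sliding window. A sketch (illustrative class, assumes requests arrive in time order):

```python
class SlidingWindowCounter:
    """Sketch: previous window's count weighted by its remaining overlap."""
    def __init__(self, limit: int = 10, window_secs: int = 60):
        self.limit = limit
        self.window_secs = window_secs
        self.prev_count = 0
        self.curr_count = 0
        self.curr_window = 0

    def allow(self, now: float) -> bool:
        window = int(now // self.window_secs)
        if window != self.curr_window:
            # Roll windows forward (prev carries over only if adjacent).
            self.prev_count = self.curr_count if window == self.curr_window + 1 else 0
            self.curr_count = 0
            self.curr_window = window
        elapsed = (now % self.window_secs) / self.window_secs
        estimated = self.prev_count * (1 - elapsed) + self.curr_count
        if estimated < self.limit:
            self.curr_count += 1
            return True
        return False

limiter = SlidingWindowCounter()
first = sum(limiter.allow(t) for t in [59.0] * 10)   # fill window 0
second = sum(limiter.allow(t) for t in [61.0] * 10)  # burst right after the boundary
print(first, second)  # 10 1
```

Compare with the fixed-window edge case: at t=61s the previous window still carries ~98% weight (10 × 59/60 ≈ 9.83), so only one extra request gets through instead of ten.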

Data Store Architecture - Redis

Why Redis?

  • In-memory storage keeps per-check overhead at sub-millisecond latency
  • Lua scripting makes the read-refill-consume cycle atomic, avoiding race conditions between app servers
  • Built-in key expiry (EXPIRE/TTL) evicts idle clients automatically
  • Hashes and sorted sets map naturally onto bucket state and request logs
  • Redis Cluster provides horizontal scaling toward the 1.15M RPS target

Sample Implementation - Token Bucket

import redis
import time
from typing import Tuple

class TokenBucketRateLimiter:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.bucket_capacity = 100  # Maximum tokens
        self.refill_rate = 10      # Tokens per second
        
    def allow_request(self, client_id: str) -> Tuple[bool, int]:
        """Return (allowed, tokens_remaining) using the token bucket algorithm."""
        key = f"rate_limit:{client_id}"
        
        # Lua script for an atomic read-refill-consume cycle. In production,
        # register it once via register_script() so Redis caches it (EVALSHA).
        lua_script = """
        local key = KEYS[1]
        local capacity = tonumber(ARGV[1])
        local refill_rate = tonumber(ARGV[2])
        local now = tonumber(ARGV[3])
        
        local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
        local tokens = tonumber(bucket[1]) or capacity
        local last_refill = tonumber(bucket[2]) or now
        
        -- Refill proportionally to the time elapsed, capped at capacity
        local elapsed = now - last_refill
        local new_tokens = math.min(capacity, tokens + elapsed * refill_rate)
        
        if new_tokens >= 1 then
            -- Consume one token (HSET with field pairs; HMSET is deprecated)
            redis.call('HSET', key, 'tokens', new_tokens - 1, 'last_refill', now)
            redis.call('EXPIRE', key, 3600)  -- evict idle buckets after 1h
            -- Redis truncates Lua numbers to integers on return; floor explicitly
            return {1, math.floor(new_tokens - 1)}
        else
            return {0, math.floor(new_tokens)}
        end
        """
        
        result = self.redis.eval(
            lua_script,
            1,            # number of KEYS
            key,
            self.bucket_capacity,
            self.refill_rate,
            time.time()
        )
        
        return bool(result[0]), int(result[1])

Partitioning Strategy
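A common approach is to shard rate-limit keys by client_id so that each client's bucket lives on exactly one node: the Lua script above then still executes atomically, with no cross-node coordination. The routing can be sketched as follows (an illustrative hash-mod function; Redis Cluster does the equivalent with CRC16 hash slots, and hash tags can be used to co-locate related keys):

```python
import hashlib

def shard_for(client_id: str, num_shards: int = 16) -> int:
    """Deterministically map a client's rate-limit key to one shard.

    Illustrative only: Redis Cluster routes keys via CRC16(key) mod 16384
    hash slots rather than SHA-1 mod num_shards.
    """
    digest = hashlib.sha1(client_id.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Same client always lands on the same shard, so its token bucket
# state is never split across nodes.
print(shard_for("user:42"), shard_for("user:42"))
```

Resharding (changing num_shards) remaps most clients under plain hash-mod, which is why consistent hashing or fixed hash slots are preferred in practice.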

Key Takeaways & Recommendations

💪 Strengths Demonstrated

  • Deep understanding of algorithm trade-offs
  • Comprehensive comparison across multiple dimensions
  • Strong analytical skills in evaluating storage and latency
  • Practical implementation considerations

📈 Areas for Improvement

  • Consider distributed clock skew handling
  • Explore cost trade-offs for different storage solutions
  • Address edge cases in distributed systems
  • Discuss graceful degradation strategies

🎯 Final Recommendation

Token Bucket emerges as the optimal choice for most production scenarios, offering:

  • Best balance between burst handling and rate enforcement
  • Reasonable storage requirements (~5GB for 100M users)
  • O(1) latency with atomic Redis operations
  • Production-proven at scale (used by AWS, Google Cloud)

For specific use cases requiring strict flow control, consider Leaky Bucket. For simpler needs with lower accuracy requirements, Fixed Window suffices.