System Architecture

At 2:17 AM on a Tuesday, a major e-commerce platform’s API went down. The incident report later revealed the root cause: a misconfigured rate limiter had allowed a burst of requests through at exactly the boundary between two time windows, overwhelming downstream services. The platform had implemented a fixed window counter, the simplest rate limiting algorithm, and paid the price for its simplicity. Because a fixed window counter resets at each window edge, a client can spend one full quota at the end of a window and another at the start of the next, so up to twice the intended limit can arrive in a short span.

Rate limiting seems straightforward: allow N requests per time period. But the algorithm you choose determines not just whether your system survives traffic spikes, but how fairly it treats users, how much memory it consumes, and whether it creates new failure modes you never anticipated. The difference between algorithms isn’t academic: it’s the difference between a system that degrades gracefully and one that cascades into total failure. ...
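
To make the boundary problem concrete, here is a minimal Python sketch of a fixed window counter. It is an illustration under stated assumptions, not the platform's actual implementation: the class name `FixedWindowLimiter`, the limit of 100 requests, and the 60-second window are all hypothetical, and timestamps are passed in explicitly so the behavior is reproducible.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical fixed window counter: at most `limit` requests per window."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)  # window index -> requests admitted so far

    def allow(self, now: float) -> bool:
        # Every timestamp maps to exactly one window; the count starts at zero
        # for each new window, which is what makes the boundary exploitable.
        window = int(now // self.window_seconds)
        if self.counts[window] >= self.limit:
            return False
        self.counts[window] += 1
        return True


# Assumed policy for illustration: 100 requests per 60-second window.
limiter = FixedWindowLimiter(limit=100, window_seconds=60)

# 100 requests in the last second of window 0 (t = 59.00 .. 59.99) ...
late = sum(limiter.allow(59.0 + i / 100) for i in range(100))
# ... followed by 100 requests in the first second of window 1 (t = 60.00 .. 60.99).
early = sum(limiter.allow(60.0 + i / 100) for i in range(100))

admitted_in_two_seconds = late + early
print(admitted_in_two_seconds)  # 200
```

Each window on its own stays within the limit, yet 200 requests are admitted in roughly two seconds, twice what a "100 per minute" policy suggests. Downstream services sized for the nominal rate see exactly the kind of burst described in the incident.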