Why Your Content Travels Faster Than Light: The Hidden Architecture of Content Delivery Networks

In 2006, Amazon discovered something that would reshape how the industry thinks about performance: every 100 milliseconds of latency cost them 1% in sales. That same year, Google found that adding just 500 milliseconds of delay to search results caused a 20% drop in traffic. These weren’t hypothetical concerns—they were measured impacts on real revenue. The physics of the internet imposes hard constraints. Light travels through fiber optic cable at roughly two-thirds its speed in vacuum—approximately 200,000 kilometers per second. A round trip from New York to Singapore covers about 30,000 kilometers of fiber, which means a theoretical minimum latency of 150 milliseconds just for light to make the journey. Add network equipment, routing hops, and protocol overhead, and real-world latency easily exceeds 200 milliseconds. ...
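That latency floor is simple arithmetic, and a few lines of Python reproduce it. The constants below are taken from the paragraph above (roughly two-thirds of the vacuum speed of light, and a ~30,000 km round trip of fiber), not measured values:

```python
# Back-of-the-envelope latency floor, using the article's figures.
SPEED_IN_FIBER_KM_S = 200_000   # ~2/3 of c in vacuum
ROUND_TRIP_KM = 30_000          # New York -> Singapore -> New York

min_latency_ms = ROUND_TRIP_KM / SPEED_IN_FIBER_KM_S * 1000
print(f"Theoretical minimum round-trip latency: {min_latency_ms:.0f} ms")
# prints 150 ms -- before routing hops, queuing, and protocol overhead
```

No amount of server tuning gets under that floor; the only lever left is moving the content closer to the user, which is exactly what a CDN does.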

12 min · 2529 words

How Consistent Hashing Scales Distributed Systems: The Mathematics Behind Minimal Rebalancing

When Amazon engineers published the Dynamo paper in 2007, they revealed a technique that had been quietly powering some of the world’s largest distributed systems. The core idea—consistent hashing—originated from a 1997 MIT paper by David Karger and colleagues, but it took a decade before the industry fully embraced its elegance. Today, consistent hashing underpins Apache Cassandra, Amazon DynamoDB, Discord’s messaging infrastructure, Netflix’s content delivery network, and virtually every modern distributed database. The algorithm solves a deceptively simple problem: how do you distribute data across servers when those servers keep joining and leaving? ...
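The core idea fits in a few dozen lines. The sketch below is a minimal illustration of a consistent-hashing ring, not how Cassandra or DynamoDB actually implement it: each node is hashed onto a ring of positions, and a key lives on the first node clockwise from the key's own hash, so adding or removing a node only remaps the keys between it and its neighbor.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    # Any stable hash works for illustration; MD5 gives a wide, even spread.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes=()):
        self._positions = []    # sorted hash positions on the ring
        self._owner = {}        # position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_hash(node)
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_hash(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def node_for(self, key: str) -> str:
        # First node clockwise from the key's position (wrapping around).
        pos = ring_hash(key)
        idx = bisect.bisect_right(self._positions, pos) % len(self._positions)
        return self._owner[self._positions[idx]]
```

The payoff is the minimal-rebalancing property: when a node joins, the only keys that move are the ones it now owns; every other key keeps its old home. Production systems add many "virtual nodes" per physical node to even out the distribution.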

9 min · 1786 words

Why Databases Choose B+ Trees Over Hash Tables and B-Trees

When you create an index on a database table, have you ever wondered what data structure actually powers it? The answer is almost always a B+ tree. Not a hash table. Not a regular B-tree. Not a binary search tree. B+ trees have been the default index structure in nearly every major relational database for over five decades—MySQL, PostgreSQL, Oracle, SQL Server, and SQLite all use them. This isn’t coincidence or legacy inertia. It’s the result of fundamental trade-offs between disk I/O patterns, range query efficiency, and storage utilization. ...
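One of those trade-offs, range-query efficiency, is easy to see even in memory. The toy below contrasts a sorted structure (standing in for linked B+ tree leaves) with a hash table over the same keys; real B+ trees add disk-page considerations on top, but the asymmetry is the same:

```python
import bisect

keys = list(range(0, 1_000_000, 7))   # sorted keys, like B+ tree leaf entries
hashed = set(keys)                    # hash-table view of the same data

def range_query_sorted(lo, hi):
    i = bisect.bisect_left(keys, lo)   # O(log n) descent to the start...
    j = bisect.bisect_right(keys, hi)
    return keys[i:j]                   # ...then a sequential "leaf" scan

def range_query_hashed(lo, hi):
    # A hash table has no order: every key must be examined.
    return sorted(k for k in hashed if lo <= k <= hi)

assert range_query_sorted(100, 140) == range_query_hashed(100, 140)
```

The hash table answers point lookups beautifully but must touch all million keys for any range; the sorted structure seeks once and reads only the matching run. On disk, where every touch is an I/O, that difference is decisive.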

12 min · 2498 words

When 99% Cache Hit Ratio Means Nothing: The Metrics You're Not Watching

A major e-commerce platform celebrated when their cache hit ratio reached 99.2%. Their dashboard showed beautiful green charts. Three days later, their database collapsed during a flash sale. The cache hit ratio never dropped below 98%. What went wrong? The team optimized for the wrong metric. While their cache served 99% of requests from memory, the 1% that missed were the most expensive queries—complex aggregations, joins across multiple tables, and costly computations. A cache hit ratio tells you how often you avoid work, not how much work you’re avoiding. ...
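The distinction is easy to quantify. The numbers below are hypothetical, chosen only to illustrate how two caches with identical hit ratios can impose wildly different database load:

```python
def backend_time_ms(requests, hit_ratio, avg_miss_cost_ms):
    # Total work reaching the database: misses weighted by their cost.
    misses = requests * (1 - hit_ratio)
    return misses * avg_miss_cost_ms

requests = 1_000_000

# Cache A: the 1% that miss are cheap point lookups (~5 ms each).
cheap = backend_time_ms(requests, 0.99, 5)

# Cache B: same 99% ratio, but the misses are heavy aggregations (~800 ms).
heavy = backend_time_ms(requests, 0.99, 800)

print(f"Cache A backend time: {cheap / 1000:.0f} s")   # ~50 s
print(f"Cache B backend time: {heavy / 1000:.0f} s")   # ~8000 s
```

Same dashboard, same green 99%, a 160x difference in database work. The metric worth watching is miss cost, not miss count.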

9 min · 1714 words

How Search Engines Find a Needle in a 400-Billion-Document Haystack

When you type a query and hit enter, results appear in under half a second. Behind that instant response lies an engineering marvel: a system that must search through hundreds of billions of documents, score each one for relevance, and return the best matches—all before you can blink. The numbers are staggering. Google’s index contains approximately 400 billion documents according to testimony from their VP of Search during the 2023 antitrust trial. The index itself exceeds 100 million gigabytes. Yet the median response time for a search query remains under 200 milliseconds. ...
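The trick that makes this possible is the inverted index: rather than scanning documents at query time, the engine builds a map from each term to the documents containing it, ahead of time. A toy sketch (real engines add positional data, compression, sharding, and learned ranking models):

```python
from collections import defaultdict

docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "quick thinking wins",
}

# Build the inverted index: term -> set of document ids (the posting list).
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query: str):
    # Union the posting lists, then rank by how many query terms match.
    terms = query.split()
    candidates = set.union(*(index[t] for t in terms if t in index))
    return sorted(candidates,
                  key=lambda d: -sum(t in docs[d].split() for t in terms))

print(search("quick brown"))   # doc 1 ranks first: it matches both terms
```

Query time scales with the length of the posting lists touched, not with the size of the corpus, which is why 400 billion documents can answer in milliseconds.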

9 min · 1860 words

Why Your API Collapsed at 2AM: The Rate Limiting Algorithm You Chose Matters

At 2:17 AM on a Tuesday, a major e-commerce platform’s API went down. The incident report later revealed the root cause: a misconfigured rate limiter had allowed a burst of requests through at exactly the boundary between two time windows, overwhelming downstream services. The platform had implemented a fixed window counter—the simplest rate limiting algorithm—and paid the price for its simplicity. Rate limiting seems straightforward: allow N requests per time period. But the algorithm you choose determines not just whether your system survives traffic spikes, but how fairly it treats users, how much memory it consumes, and whether it creates new failure modes you never anticipated. The difference between algorithms isn’t academic—it’s the difference between a system that degrades gracefully and one that cascades into total failure. ...
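The boundary flaw in that incident is easy to demonstrate. The sketch below is a bare-bones fixed-window counter: with a limit of N per window, a client that fires N requests at the end of one window and N more at the start of the next gets roughly 2N through in a span far shorter than the window.

```python
class FixedWindowLimiter:
    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.current_window = None
        self.count = 0

    def allow(self, now: float) -> bool:
        window = int(now // self.window_s)
        if window != self.current_window:
            # New window: the counter resets, forgetting all recent traffic.
            self.current_window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False

limiter = FixedWindowLimiter(limit=100, window_s=60.0)

# 100 requests at t=59.9s (end of window 0)...
passed = sum(limiter.allow(59.9) for _ in range(100))
# ...and 100 more at t=60.1s (start of window 1).
passed += sum(limiter.allow(60.1) for _ in range(100))
print(f"{passed} requests passed in 0.2 s")   # 200 -- double the limit
```

A sliding-window or token-bucket limiter closes this gap by accounting for recent history instead of resetting it, at the cost of a little extra state per client.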

11 min · 2131 words

Why Your Database Connection Pool of 100 Is Killing Performance

The Oracle Real-World Performance group published a demonstration that should have changed how every developer thinks about connection pools. They took a system struggling with ~100ms average response times and reduced those times to ~2ms—a 50x improvement. They didn’t add hardware. They didn’t rewrite queries. They reduced the connection pool size from 2048 connections down to 96. Most developers configure connection pools based on intuition: more users means more connections, right? A typical production configuration sets the pool to 100, 200, or even 500 connections “just to be safe.” This intuition is precisely backwards. The correct question isn’t how to make your pool bigger—it’s how small you can make it while still handling your load. ...
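One widely cited starting point, popularized by the PostgreSQL wiki and HikariCP's pool-sizing notes, ties the pool to the database server's resources rather than the number of application users. Treat it as a heuristic to benchmark from, not a universal rule:

```python
def pool_size(core_count: int, effective_spindle_count: int) -> int:
    # effective_spindle_count ~ concurrent disk I/Os the storage can
    # actually serve; often taken as a small number (or ~1 for an SSD
    # that rarely blocks the workload on I/O).
    return core_count * 2 + effective_spindle_count

print(pool_size(core_count=8, effective_spindle_count=1))   # 17
```

The intuition: a database core can only execute a handful of queries at once, and beyond that point extra connections just add context switching and lock contention, which is exactly why shrinking the pool made the Oracle demo faster.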

11 min · 2155 words