When the Internet Collapsed: The 40-Year Evolution of TCP Congestion Control

In October 1986, something alarming happened on the Internet. Data throughput between Lawrence Berkeley Laboratory and UC Berkeley, sites separated by just 400 yards and two network hops, dropped from 32 Kbps to 40 bps. That is not a typo. Throughput collapsed by a factor of 800, nearly three orders of magnitude. The Internet was experiencing its first “congestion collapse,” and nobody knew how to fix it. Van Jacobson, then at Lawrence Berkeley Laboratory, became fascinated by this catastrophic failure. His investigation led to a landmark 1988 paper titled “Congestion Avoidance and Control,” which introduced the fundamental algorithms that still govern how data flows through the Internet today. The story of TCP congestion control, from those desperate early fixes to modern algorithms like CUBIC and BBR, is really a story about how we learned to share a finite resource without a central coordinator. ...

9 min · 1856 words

When Pakistan Accidentally Took Down YouTube: The Fragile Trust Model of BGP

On February 24, 2008, at 18:47 UTC, Pakistan Telecom (AS17557) started announcing a more specific route to part of YouTube’s IP address space: 208.65.153.0/24. Within minutes, YouTube traffic from around the world was being redirected to Pakistan. The Pakistani government had ordered the ISP to block YouTube domestically, but a configuration error caused the route to propagate globally through PCCW Global (AS3491), its upstream provider. YouTube engineers responded approximately 80 minutes later by announcing even more specific routes (/25 prefixes) to reclaim their traffic. By 21:01 UTC, the hijack was over. But for just over two hours, a single misconfiguration in one country had effectively stolen one of the world’s most popular websites. ...

13 min · 2626 words

How One Router Misconfiguration Took Down Facebook: The Fragile Architecture of BGP

On October 4, 2021, at 15:40 UTC, Facebook disappeared from the internet. Not just the social network—Instagram, WhatsApp, and even Facebook’s internal tools went dark. Engineers couldn’t access their own data centers. The outage lasted nearly six hours and affected billions of users worldwide. The cause wasn’t a cyberattack or a data center failure. It was a BGP configuration error. Someone issued a command that withdrew the routes Facebook used to announce its presence to the internet, and within minutes, the company’s entire network became unreachable. ...

11 min · 2280 words