When One Second Crashed the Internet: The Hidden Complexity of Timekeeping
At 23:59:60 UTC on June 30, 2012, a second was added to the world’s clocks. Within minutes, Reddit, LinkedIn, Mozilla, Gawker, and dozens of other major websites had crashed. Their servers were running at 100% CPU, locked in tight loops that made them completely unresponsive. The culprit wasn’t a cyberattack or a hardware failure; it was the handling of a single extra second. The leap second left the Linux kernel’s high-resolution timer subsystem, called hrtimer, out of sync with the system clock. When the kernel stepped the system clock backward by one second, the timer subsystem was not notified of the change, so timers set against the wall clock fired early. Sleeping processes were awakened prematurely, again and again, flooding the CPU with activity. Java-based applications like Cassandra, the database powering Reddit, were particularly affected. The site was offline for over an hour. ...
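The failure mode described above can be sketched with a toy model. This is not the kernel's code: `ToyClock`, `timer_view`, and `actual_wait` are hypothetical names invented for illustration, and the model assumes only that the timer code keeps its own copy of wall time, which goes stale if nobody updates it when the clock is stepped back.

```python
from dataclasses import dataclass


@dataclass
class ToyClock:
    """Hypothetical model: wall time plus the timer code's own copy of it."""
    wall: float        # what the system clock reports, in seconds
    timer_view: float  # what the timer code believes wall time is

    def insert_leap_second(self, notify_timers: bool) -> None:
        # The wall clock repeats one second, i.e. it steps back by 1 s.
        self.wall -= 1.0
        # If the timer code is not told, its copy stays one second ahead.
        if notify_timers:
            self.timer_view = self.wall


def actual_wait(clock: ToyClock, timeout: float) -> float:
    """Return how long a sleep of `timeout` seconds really lasts in this model."""
    deadline = clock.wall + timeout  # caller computes deadline from wall time
    # The timer fires once the timer code's view of time reaches the deadline.
    return max(0.0, deadline - clock.timer_view)


if __name__ == "__main__":
    stale = ToyClock(wall=100.0, timer_view=100.0)
    stale.insert_leap_second(notify_timers=False)
    # A half-second sleep returns immediately; a loop around it would spin.
    print("stale, 0.5 s timeout ->", actual_wait(stale, 0.5))
    print("stale, 2.0 s timeout ->", actual_wait(stale, 2.0))

    synced = ToyClock(wall=100.0, timer_view=100.0)
    synced.insert_leap_second(notify_timers=True)
    print("synced, 0.5 s timeout ->", actual_wait(synced, 0.5))
```

In this sketch, any timeout shorter than one second expires at once while the timer's view is stale. A program that sleeps briefly inside a retry loop therefore stops sleeping and simply loops, which is consistent with the 100% CPU symptom described above.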