When One Second Crashed the Internet: The Hidden Complexity of Timekeeping

At 23:59:60 UTC on June 30, 2012, a second was added to the world’s clocks. Within minutes, Reddit, LinkedIn, Mozilla, Gawker, and dozens of other major websites had crashed. Their servers were running at 100% CPU, locked in tight loops that made them completely unresponsive. The culprit wasn’t a cyberattack or a hardware failure—it was the handling of a single extra second. The Linux kernel’s high-resolution timer subsystem, called hrtimer, had gotten confused by the leap second. When the system clock stepped backward by one second, sleeping processes were awakened prematurely, flooding the CPU with activity. Java-based applications like Cassandra—the database powering Reddit—were particularly affected. The site was offline for over an hour. ...

12 min · 2481 words

Why One Second Brought Down Cloudflare DNS: The Hidden Complexity of Time

At midnight UTC on January 1, 2017, deep inside Cloudflare’s custom RRDNS software, a number went negative when it should have always been at least zero. This single value caused DNS resolutions to fail across Cloudflare’s global network. The culprit? A leap second—one extra tick of the clock that most people never noticed. The bug revealed a fundamental truth that every programmer eventually learns the hard way: time is not what you think it is. It doesn’t flow uniformly forward. It jumps, skips, and occasionally rewinds. And if your code assumes otherwise, it will break in ways that are nearly impossible to predict. ...

9 min · 1899 words