How Git Actually Stores Your Code: The Hidden Architecture Behind Every Commit

On April 7, 2005, Linus Torvalds made the first commit to a new version control system. He had started coding it just four days earlier, on April 3rd, after the proprietary tool he had been using for Linux kernel development became unavailable. The kernel community needed something fast, distributed, and capable of handling thousands of contributors. What Torvalds built in those frantic days wasn’t just another version control system—it was a content-addressable filesystem disguised as one. ...
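The "content-addressable" idea can be seen in miniature in how Git names a blob: the object ID is simply a SHA-1 over a small header plus the file's bytes, so identical content always gets the identical name. A minimal sketch in Python:

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob: SHA-1 over a
    'blob <size>\\0' header followed by the raw content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The empty blob always hashes to the same well-known ID, which is
# why identical content is stored only once across every repository.
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the name is derived from the content, storing a file twice is impossible by construction, and any corruption changes the hash and is immediately detectable.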

10 min · 1985 words

How Wi-Fi Actually Travels Through Walls: The Physics of Invisible Data

The coffee shop has free Wi-Fi. The password is posted on a chalkboard near the counter. You sit in the corner booth, open your laptop, and connect. The signal passes through three walls, a glass window, and a wooden partition before reaching your device. How? This isn’t a minor engineering achievement. Your router is broadcasting radio waves at frequencies measured in billions of cycles per second, encoding gigabytes of data into invisible electromagnetic fields, and somehow that signal arrives intact after bouncing off your refrigerator, penetrating your walls, and competing with your neighbor’s network. Understanding how this works requires peeling back layers of physics that most people never consider—electromagnetic wave behavior, material properties, and the mathematical cleverness of modern encoding schemes. ...

13 min · 2666 words

Why Message Queues Are Harder Than They Look: The Hidden Complexity of "Simple" Asynchronous Communication

A financial services company migrated their payment processing to a message queue architecture. The design seemed straightforward: producers publish payment requests, workers consume and process them. Six months later, they discovered their customers had been double-charged for approximately 3% of transactions. The queue was working exactly as configured—the problem was that “working” meant something different than they expected. Message queues occupy a strange position in distributed systems. They appear deceptively simple on the surface: put message in, get message out. But beneath that simplicity lies a maze of trade-offs involving durability, ordering, delivery guarantees, and failure modes. Understanding these trade-offs isn’t academic—it’s the difference between a reliable system and one that silently corrupts data. ...
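The double-charge failure above is the classic symptom of at-least-once delivery: the broker redelivers a message whose acknowledgment was lost, and a naive consumer charges twice. One common defense is an idempotency key checked before processing. A minimal sketch (the `idempotency_key` field name and in-memory set are illustrative; a real system would use a durable store):

```python
processed = set()  # illustrative only: production needs a durable, shared store

def handle_payment(message: dict) -> str:
    """Process a payment safely under at-least-once delivery by
    deduplicating on an idempotency key carried in the message."""
    key = message["idempotency_key"]
    if key in processed:
        return "duplicate-ignored"
    # ... charge the card here ...
    processed.add(key)
    return "charged"

msg = {"idempotency_key": "order-42", "amount_cents": 1999}
print(handle_payment(msg))  # charged
print(handle_payment(msg))  # duplicate-ignored -- redelivery is now harmless
```

The point is that the queue's semantics did not change; the consumer was made tolerant of them.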

11 min · 2218 words

When Zero-Copy Isn't Zero: The Hidden Copies in Your "Efficient" Code

A file sits on disk. Your application reads it and sends it over the network. Simple enough—but behind this mundane operation hides one of computing’s most persistent performance bottlenecks. In a traditional I/O path, that single file’s data undergoes four distinct memory copies before reaching the network interface. The kernel reads data from disk into a kernel buffer via DMA. The read() system call copies it to user space. The write() system call copies it back to a kernel socket buffer. Finally, DMA transfers it to the NIC. Each copy consumes CPU cycles, memory bandwidth, and cache space. ...
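The contrast between the two paths can be sketched with Python's stdlib: the read()/write() loop bounces every byte through user space, while `os.sendfile()` asks the kernel to move the data internally. A minimal, hedged sketch (temporary files stand in for the disk file and the socket; `sendfile` to a regular file is Linux-specific, so the code falls back where unsupported):

```python
import os
import tempfile

def copy_traditional(src_fd, dst_fd, bufsize=64 * 1024):
    """The two extra copies: read() moves kernel buffer -> user space,
    write() moves user space -> kernel again."""
    while chunk := os.read(src_fd, bufsize):
        os.write(dst_fd, chunk)

with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
    src.write(b"payload" * 1000)
    src.seek(0)
    try:
        # Zero-copy path: data never crosses into user space at all.
        os.sendfile(dst.fileno(), src.fileno(), 0, 7000)
    except OSError:
        copy_traditional(src.fileno(), dst.fileno())  # fallback where unsupported
    dst.seek(0)
    print(len(dst.read()))  # 7000
```

Either path moves the same 7,000 bytes; what differs is how many times the CPU touches them along the way.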

8 min · 1585 words

Why You Cannot Simply Turn Off a Nuclear Reactor: The Physics of Decay Heat

On December 2, 1942, beneath the stands of a squash court at the University of Chicago, Enrico Fermi and his team achieved something humanity had never accomplished: a controlled, self-sustaining nuclear chain reaction. Chicago Pile-1, the world’s first nuclear reactor, produced just half a watt of power—barely enough to light a small bulb. Yet it demonstrated a principle that now generates about 9% of the world’s electricity, powering hundreds of millions of homes with the energy locked inside atomic nuclei. ...

11 min · 2135 words

When One Bit Can Kill: How Error Correction Codes Save Your Data Every Day

In 1947, a mathematician at Bell Labs faced a frustrating problem. Richard Hamming was using the Model V relay computer to perform calculations, and every weekend the machine would grind to a halt when it encountered an error. The computer would simply stop, flashing its error lights, and Hamming would have to wait until Monday for the operators to reload his program. One Friday evening, staring at the silent machine, he asked himself a question that would change computing forever: “Why can’t the computer correct its own mistakes?” ...

14 min · 2877 words

How Docker Actually Works: Namespaces, Cgroups, and the Linux Kernel Features That Make Containers Possible

In March 2013, Solomon Hykes demonstrated a new tool at PyCon that would fundamentally change how software gets deployed. He ran a simple command: docker run -i -t ubuntu /bin/bash. Within seconds, a complete Ubuntu environment appeared, ready to accept commands. The audience saw what looked like a lightweight virtual machine. What they were actually witnessing was something far more elegant: a single Linux process, wrapped in kernel features that had been maturing for over a decade. ...

14 min · 2962 words

Why Your Content Travels Faster Than Light: The Hidden Architecture of Content Delivery Networks

In 2006, Amazon discovered something that would reshape how the industry thinks about performance: every 100 milliseconds of latency cost them 1% in sales. That same year, Google found that adding just 500 milliseconds of delay to search results caused a 20% drop in traffic. These weren’t hypothetical concerns—they were measured impacts on real revenue. The physics of the internet imposes hard constraints. Light travels through fiber optic cable at roughly two-thirds its speed in vacuum—approximately 200,000 kilometers per second. A round trip from New York to Singapore covers about 30,000 kilometers of fiber, which means a theoretical minimum latency of 150 milliseconds just for light to make the journey. Add network equipment, routing hops, and protocol overhead, and real-world latency easily exceeds 200 milliseconds. ...
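The arithmetic behind that 150-millisecond floor is worth making explicit, because it also shows the CDN's entire value proposition: you cannot speed up light, but you can shorten the trip. A quick worked example (the 1,000 km edge-node distance is an illustrative assumption, not a figure from the article):

```python
SPEED_IN_FIBER_KM_S = 200_000   # roughly 2/3 of c in vacuum
round_trip_km = 30_000          # New York <-> Singapore over fiber, both ways

min_latency_ms = round_trip_km / SPEED_IN_FIBER_KM_S * 1000
print(min_latency_ms)  # 150.0 -- before any routing or protocol overhead

# Serve the same content from a hypothetical edge node ~1,000 km away:
edge_round_trip_ms = 2 * 1_000 / SPEED_IN_FIBER_KM_S * 1000
print(edge_round_trip_ms)  # 10.0
```

Moving the content closer cuts the physical floor by an order of magnitude, which is the whole game.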

12 min · 2529 words

How Consistent Hashing Scales Distributed Systems: The Mathematics Behind Minimal Rebalancing

When Amazon engineers published the Dynamo paper in 2007, they revealed a technique that had been quietly powering some of the world’s largest distributed systems. The core idea—consistent hashing—originated from a 1997 MIT paper by David Karger and colleagues, but it took a decade before the industry fully embraced its elegance. Today, consistent hashing underpins Apache Cassandra, Amazon DynamoDB, Discord’s messaging infrastructure, Netflix’s content delivery network, and virtually every modern distributed database. The algorithm solves a deceptively simple problem: how do you distribute data across servers when those servers keep joining and leaving? ...
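The core trick can be shown in a few lines: hash both nodes and keys onto the same circular space, and send each key to the next node clockwise. A minimal sketch (node names, MD5 as the ring hash, and a single position per node are illustrative simplifications; real systems add virtual nodes for balance):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring: each key maps to the first node
    whose ring position is at or past the key's hash, wrapping around."""
    def __init__(self, nodes):
        self.ring = sorted((_hash(n), n) for n in nodes)

    def lookup(self, key: str) -> str:
        i = bisect.bisect(self.ring, (_hash(key), ""))
        return self.ring[i % len(self.ring)][1]

ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in map(str, range(1000))}
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
moved = sum(before[k] != ring.lookup(k) for k in before)
print(moved)  # only the keys on node-d's new arc move
```

Compare this with naive `hash(key) % num_nodes` placement, where going from three nodes to four remaps roughly three-quarters of all keys instead of a small fraction.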

9 min · 1786 words

How Password Hashing Actually Works: From Rainbow Tables to Memory-Hard Functions

On June 5, 2012, a Russian hacker named Yevgeniy Nikulin accessed LinkedIn’s database and exfiltrated 6.5 million password hashes. What happened next became a textbook case of what not to do with passwords. LinkedIn had stored those passwords using SHA-1—without any salt. Within hours, security researchers were cracking thousands of passwords per minute. By the time LinkedIn disclosed the breach, over 60% of the stolen hashes had already been reversed. ...
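What unsalted SHA-1 got wrong is fixed by two ingredients: a random per-user salt, which makes precomputed rainbow tables useless, and a deliberately expensive, memory-hard function, which slows every individual guess. A minimal sketch using the standard library's scrypt (the cost parameters here are illustrative, sized for roughly 16 MB of memory per guess):

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None):
    """Salted, memory-hard password hash using stdlib scrypt."""
    salt = salt or os.urandom(16)  # unique per user; stored alongside the hash
    digest = hashlib.scrypt(password.encode(), salt=salt,
                            n=2**14, r=8, p=1)  # ~16 MB of memory per guess
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    # Constant-time comparison avoids leaking match length via timing.
    return hmac.compare_digest(hash_password(password, salt)[1], digest)

salt, digest = hash_password("correct horse battery staple")
print(verify("correct horse battery staple", salt, digest))  # True
print(verify("wrong guess", salt, digest))                   # False
```

With this scheme, an attacker who steals the database must attack each password individually, at a memory cost that makes GPU and ASIC cracking rigs far less effective.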

10 min · 2093 words

The CORS Error That Cost $50,000: When Access-Control-Allow-Origin Becomes an Attack Vector

In October 2016, a security researcher discovered a misconfigured CORS endpoint on a major bitcoin exchange. By exploiting a simple header reflection vulnerability, they could have stolen users’ API keys, disabled notifications, enabled two-factor authentication to lock out account owners, and transferred bitcoins to any address. They reported it instead. The bug bounty payout was substantial. Three different bitcoin exchanges were found vulnerable to similar CORS misconfigurations during the same research period. ...

11 min · 2262 words

Why Databases Choose B+ Trees Over Hash Tables and B-Trees

When you create an index on a database table, have you ever wondered what data structure actually powers it? The answer is almost always a B+ tree. Not a hash table. Not a regular B-tree. Not a binary search tree. B+ trees have been the default index structure in nearly every major relational database for decades—MySQL, PostgreSQL, Oracle, SQL Server, and SQLite all use them. This isn’t coincidence or legacy inertia. It’s the result of fundamental trade-offs between disk I/O patterns, range query efficiency, and storage utilization. ...

12 min · 2498 words