A poker site once lost millions because its shuffling algorithm could be predicted. The root cause? A random number generator that wasn’t random at all. The engineers had used a predictable seed, and attackers reverse-engineered the entire deck sequence from just a few observed hands.
This wasn’t an isolated incident. From lottery rigging scandals to cryptocurrency wallet thefts, the history of computing is littered with disasters caused by insufficient randomness. Yet here’s the paradox: computers are deterministic machines. Run the same instructions on the same inputs, and they produce the same result. So where does randomness actually come from?
The answer reveals one of the most elegant architectures in modern computing—a multi-layered system that harvests chaos from the physical world, distills it through mathematical algorithms, and delivers it to applications that depend on unpredictability for their very existence.
The Fundamental Problem: Determinism vs. Randomness
At the transistor level, a computer is a clockwork. Every operation follows from the previous one. Feed the same input to a function, and you’ll get the same output—this is a feature, not a bug. Imagine the chaos if your spreadsheet calculated 2+2 differently each time.
But cryptography, simulations, and games all require genuine unpredictability. An encryption key that an attacker can guess is worthless. A simulation that produces the same results every time teaches nothing. A game where players can predict the next card has broken economics.
This tension—between the deterministic nature of computers and the need for unpredictable output—defines the entire field of random number generation.
True Random vs. Pseudorandom: Two Fundamentally Different Approaches
Random number generators fall into two categories that differ not just in implementation but in philosophy.
True Random Number Generators (TRNGs) measure physical phenomena that are inherently unpredictable: thermal noise in circuits, timing variations in hardware, radioactive decay, atmospheric noise. These processes are governed by quantum mechanics or by chaotic dynamics that no attacker can measure precisely enough to predict.
Pseudorandom Number Generators (PRNGs) are deterministic algorithms that take a small seed value and stretch it into a long sequence of numbers that appear random. Given the same seed, they produce the identical sequence—every time.
The distinction matters enormously for security. A TRNG’s output is unpredictable even if you know everything about how it works. A PRNG’s output is completely predictable if you know the seed.
Here’s the key insight: modern systems use both. They harvest true randomness from physical sources to seed PRNGs, which then efficiently generate the massive amounts of “random” data that applications consume.
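The seed dependence described above is easy to observe directly. The following is a minimal sketch using Python’s standard `random` module (a Mersenne Twister, not a CSPRNG); the helper name `sequence` and the seed values are chosen purely for illustration.

```python
import random

def sequence(seed, count=5):
    """Return the first `count` outputs of a PRNG started from `seed`."""
    rng = random.Random(seed)  # private generator; the global one is untouched
    return [rng.randrange(1_000_000) for _ in range(count)]

first = sequence(12345)
second = sequence(12345)
other = sequence(12346)

print(first == second)  # same seed, identical sequence
print(first == other)   # compare against a neighbouring seed
```

Anyone who learns or guesses the seed can regenerate the whole sequence in the same way, which is why the quality of the seed matters more than the quality of the algorithm.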
Where Does Entropy Actually Come From?
Entropy—in this context, a measure of unpredictability—is the fuel that powers random number generation. But harvesting it is surprisingly difficult.
Hardware Sources
Modern processors contain dedicated entropy harvesting mechanisms. Intel’s Ivy Bridge architecture, released in 2012, includes a Digital Random Number Generator (DRNG) that uses thermal noise in a metastable circuit to generate true random bits. Two instructions provide access: RDRAND, available since Ivy Bridge, returns random values from a hardware-seeded CSPRNG, while RDSEED, added with the later Broadwell generation, returns conditioned output of the entropy source itself and is intended for seeding other generators.
The thermal noise comes from the random motion of electrons in silicon at finite temperature. This noise is amplified by a circuit designed to be exquisitely sensitive to tiny variations. The result: a stream of bits that even Intel’s own engineers cannot predict.
AMD processors include similar capabilities. ARM processors often implement TRNGs as well, though implementations vary by vendor.
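On Linux, one way to see whether a CPU advertises these instructions is to look for the `rdrand` and `rdseed` feature flags in `/proc/cpuinfo`. The sketch below assumes an x86 Linux system; the helper names are illustrative, and on other platforms (or on ARM, where the file uses a different layout) it simply reports nothing.

```python
from pathlib import Path

def cpu_rng_flags(cpuinfo_text):
    """Return which of the rdrand/rdseed flags appear in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # x86 kernels list CPU features on lines beginning with "flags"
        if line.startswith("flags"):
            _, _, values = line.partition(":")
            found |= {"rdrand", "rdseed"} & set(values.split())
    return found

def read_cpuinfo(path="/proc/cpuinfo"):
    """Read cpuinfo, returning an empty string where the file is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""

print(sorted(cpu_rng_flags(read_cpuinfo())))
```

A missing flag does not necessarily mean the hardware lacks the feature; hypervisors can mask CPU flags from guests.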
Operating System Sources
Before hardware RNGs became ubiquitous, operating systems had to be creative. Linux gathers entropy from:
- Keyboard and mouse timing: The precise microsecond when you press a key or move your mouse is influenced by countless factors—your muscle movements, the air currents, the electrical noise in the USB controller.
- Disk drive timing: Hard disk seeks take variable amounts of time due to mechanical factors. SSDs are more deterministic, but their internal wear-leveling algorithms introduce some variability.
- Network packet timing: The arrival times of network packets are influenced by network congestion, speed-of-light delays, and router queuing behaviors.
- Interrupt timing: Every hardware interrupt—timer tick, keyboard press, disk completion—occurs at a slightly different microsecond than expected due to system load and hardware variations.
These sources are mixed together using a cryptographic hash function. Even if some sources are predictable or compromised, the remaining ones contribute entropy that attackers cannot predict.
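The mixing idea can be illustrated with a toy example. This is not the kernel’s algorithm, only a sketch of the principle: several inputs of varying quality are hashed together, so the result is unpredictable as long as at least one input is. The function names are hypothetical, and user-space timing jitter is a weak source used here only for demonstration.

```python
import hashlib
import os
import time

def collect_timing_samples(count=256):
    """Collect low-order bits of a high-resolution clock around tiny workloads."""
    samples = bytearray()
    for _ in range(count):
        start = time.perf_counter_ns()
        sum(range(100))  # small workload whose duration jitters slightly
        elapsed = time.perf_counter_ns() - start
        samples.append(elapsed & 0xFF)  # keep only the least significant byte
    return bytes(samples)

def mix_sources(*sources):
    """Hash several inputs into one 32-byte value, length-prefixing each."""
    h = hashlib.sha256()
    for data in sources:
        h.update(len(data).to_bytes(8, "big"))  # unambiguous framing
        h.update(data)
    return h.digest()

pool = mix_sources(collect_timing_samples(), os.urandom(32), b"predictable")
print(pool.hex())
```

The length prefix keeps the framing unambiguous, so two different splits of the same bytes across sources cannot produce the same digest.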
The Virtualization Problem
Virtual machines face a particular challenge. A VM has no physical keyboard or mouse. Its disk is often a virtual file on the host. Its network timing is controlled by the hypervisor. The entropy sources that a physical machine relies on are either absent or artificial.
This creates a boot-time entropy starvation problem. When a VM first boots, it has almost no entropy. Yet it immediately needs random numbers to generate SSH host keys, SSL certificates, and other security-critical values. Cloud providers address this by providing entropy via virtio-rng or similar mechanisms, injecting randomness from the host into guests.
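Inside a Linux guest, the kernel’s hardware RNG framework reports which device, if any, is feeding it through sysfs attributes under `/sys/class/misc/hw_random`. The sketch below reads those attributes and assumes nothing beyond their existence on kernels built with that framework; where they are absent it returns `None`. On a guest with virtio-rng attached, the active device name would typically start with `virtio_rng`, though the exact string is an assumption that may vary.

```python
from pathlib import Path

HWRNG_DIR = Path("/sys/class/misc/hw_random")

def read_sysfs(name, base=HWRNG_DIR):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (base / name).read_text().strip()
    except OSError:
        return None

current = read_sysfs("rng_current")
available = read_sysfs("rng_available")
print("active hardware RNG:", current)
print("available hardware RNGs:", available)
```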
The Linux Random Number Generator: A Deep Dive
Linux provides two primary interfaces for random numbers:
/dev/random: Historically, this blocked when the kernel estimated that entropy was depleted. Reading from it could hang your process for seconds or minutes.
/dev/urandom: This never blocks, drawing from the same entropy pool but continuing to produce output even when the kernel’s entropy estimate hits zero.
The distinction was once considered important for cryptographic security. Today, the consensus among cryptographers is different: always use /dev/urandom (or the getrandom() system call).
Here’s why. The kernel’s entropy estimate is conservative and somewhat arbitrary. When it hits zero, /dev/urandom continues to produce output from its internal CSPRNG—a construction called ChaCha20 that has withstood years of cryptographic analysis. The output remains unpredictable even when the entropy counter reads zero, because the internal state remains unknown to attackers.
Blocking on /dev/random, meanwhile, can cause security problems. Applications that block during boot waiting for entropy can delay security services, create timeout conditions, or fall back to less secure alternatives.
The getrandom() System Call
Introduced in Linux 3.17, getrandom() is the modern interface for obtaining random numbers. It addresses the file descriptor exhaustion attack that plagued /dev/urandom (an attacker could open all available file descriptors, preventing the application from opening /dev/urandom).
More importantly, getrandom() provides configurable blocking behavior. By default, it blocks during early boot until the entropy pool is initialized—preventing the use of predictable randomness during that critical window—but then never blocks again, providing the best of both worlds.
#include <sys/random.h>
ssize_t getrandom(void *buf, size_t buflen, unsigned int flags);
The flags parameter controls behavior: GRND_RANDOM uses the /dev/random pool (almost never needed), GRND_NONBLOCK makes the call fail with EAGAIN instead of blocking when the pool isn’t yet initialized (useful, but the caller must handle that failure rather than fall back to something weaker), and 0 provides the secure default.
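Python exposes the same system call as `os.getrandom()` on Linux, which makes the flag behavior easy to try out. The wrapper names below are illustrative; the fallback to `os.urandom()` is an assumption made so the sketch also runs on platforms without `getrandom()`.

```python
import os

def random_bytes(n):
    """Return n bytes from the kernel CSPRNG, blocking only until it is seeded."""
    if hasattr(os, "getrandom"):       # Linux, Python 3.6+
        return os.getrandom(n, 0)      # flags=0: the secure default
    return os.urandom(n)               # portable fallback

def random_bytes_nonblocking(n):
    """Return n bytes, or None if the pool is not yet initialized."""
    if not hasattr(os, "getrandom"):
        return os.urandom(n)
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:            # EAGAIN: pool not ready, no bytes produced
        return None

key = random_bytes(32)
print(len(key))
```

The non-blocking variant returns `None` rather than weaker bytes, so the caller has to decide explicitly whether to wait or to fail.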
The Architecture of Modern PRNGs
Linear Congruential Generators: Fast But Flawed
The simplest PRNGs use a linear recurrence:
$$X_{n+1} = (aX_n + c) \bmod m$$
For well-chosen constants a, c, and m, this produces a sequence that passes basic statistical tests. It’s incredibly fast: just one multiply, one add, and one modulo.
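The recurrence translates almost line for line into code. In this sketch the constants are the ones published in Numerical Recipes (a = 1664525, c = 1013904223, m = 2^32); they are an assumption chosen for illustration, not the parameters of any generator discussed in this article.

```python
class LCG:
    """Linear congruential generator: X_{n+1} = (a * X_n + c) mod m."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next(self):
        """Advance the state by one step and return it as the output."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

gen = LCG(seed=0)
print([gen.next() for _ in range(3)])
```

Note that the output is the entire internal state, which is exactly what makes this family so easy to predict.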
But LCGs are cryptographically worthless. Given a few consecutive outputs, an attacker can solve for the internal state and predict all future outputs. Even worse, LCGs exhibit patterns in high dimensions—plot successive outputs in 3D space, and they fall on hyperplanes.
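To make the state-recovery claim concrete, here is a sketch of the simplest case: the attacker knows the modulus, sees three consecutive full outputs, and solves for the multiplier and increment. The constants and seed are illustrative assumptions. Real attacks on truncated outputs need more machinery (lattice techniques), which this example does not cover.

```python
M = 2**32  # modulus, assumed known to the attacker

def lcg_outputs(seed, a, c, count):
    """Generate `count` outputs of X_{n+1} = (a * X_n + c) mod M."""
    x, out = seed, []
    for _ in range(count):
        x = (a * x + c) % M
        out.append(x)
    return out

def recover_parameters(x0, x1, x2):
    """Solve for (a, c) from three consecutive outputs.

    x2 - x1 = a * (x1 - x0) mod M, so a follows from a modular inverse.
    Raises ValueError if (x1 - x0) is not invertible mod M (i.e. is even).
    """
    a = ((x2 - x1) * pow((x1 - x0) % M, -1, M)) % M
    c = (x1 - a * x0) % M
    return a, c

secret_a, secret_c = 1664525, 1013904223
observed = lcg_outputs(seed=2024, a=secret_a, c=secret_c, count=4)

a, c = recover_parameters(*observed[:3])
predicted = (a * observed[2] + c) % M
print(predicted == observed[3])
```

With these particular constants the difference between consecutive outputs is always odd, so the inverse always exists; for other parameters the attacker may need a few more outputs.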
Java’s java.util.Random uses an LCG variant. Never use it for anything security-related.
Mersenne Twister: Good for Simulation, Bad for Cryptography
For years, Mersenne Twister has been the default PRNG in Python, Ruby, and many scientific computing libraries. It has an enormous period of $2^{19937}-1$ and passes most statistical tests.
But Mersenne Twister is not cryptographically secure. Given 624 consecutive 32-bit outputs, an attacker can recover the entire 2.5-kilobyte internal state and predict all future outputs. For simulations and games, this doesn’t matter. For cryptography, it’s fatal.
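The state-recovery attack can be demonstrated against Python’s own `random` module, which uses MT19937. The sketch below inverts the output tempering for each of 624 observed values and loads the result into a second generator. It relies on CPython’s `getstate()`/`setstate()` tuple layout, which is an implementation detail, and the helper names are illustrative.

```python
import random

def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF

def untemper(y):
    """Reverse MT19937's output tempering to recover one state word."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y

def clone_from_outputs(outputs):
    """Build a generator that continues the sequence after 624 observed outputs."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    state = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # CPython state layout: (version, 624 words + position index, gauss_next).
    # Index 624 means all words are consumed, so the next call regenerates.
    clone.setstate((3, state + (624,), None))
    return clone

victim = random.Random()  # seeded from the OS; the seed is unknown to us
observed = [victim.getrandbits(32) for _ in range(624)]

clone = clone_from_outputs(observed)
print(all(clone.getrandbits(32) == victim.getrandbits(32) for _ in range(1000)))
```

The attacker never learns the seed. Knowing the current state is enough to predict everything that follows.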
The algorithm also has a cold-start problem: if the seed has many zero bits, early outputs have poor statistical properties. Modern alternatives like PCG and xoroshiro address these issues while maintaining speed.
ChaCha20: The Modern Standard
When cryptographic security matters, modern systems increasingly use ChaCha20 as the core PRNG. Originally designed as a stream cipher by Daniel J. Bernstein, ChaCha20 has several desirable properties:
- It’s a CSPRNG: given a secure seed, its output is computationally indistinguishable from true randomness
- It’s resistant to timing attacks: all operations are constant-time
- It’s fast in software: designed for CPUs without hardware AES acceleration
- It supports forward secrecy: when the generator replaces its key with fresh output after each request, compromising the state doesn’t reveal past outputs
Linux 4.8 switched from a custom design to ChaCha20 for /dev/urandom. OpenBSD uses ChaCha20 for its arc4random() function. Rust’s widely used rand crate builds its default generator on ChaCha. The list goes on.
Cryptographic Requirements: What Makes a PRNG “Secure”?
A cryptographically secure PRNG (CSPRNG) must satisfy two requirements:
Next-bit test: Given the first k bits of output, no polynomial-time algorithm can predict the (k+1)th bit with probability significantly better than 1/2.
State compromise resistance: If an attacker learns the internal state at time t, they cannot:
- Predict outputs generated after the next reseed (backward secrecy is violated if they can)
- Recover past outputs (forward secrecy is violated if they can)
This second property is crucial. Real systems get compromised. Keys get leaked. If learning the current PRNG state reveals all past outputs, then every encryption key, every TLS session, every generated password from the compromised period is also compromised.
Forward secrecy comes from updating the state through a one-way function, so that an earlier state cannot be recomputed from the current one. Achieving backward secrecy requires periodically reseeding the PRNG with fresh entropy. Even if an attacker captures the state at one moment, they won’t know what entropy is mixed in afterward.
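Both halves of state compromise resistance can be shown with a toy hash-based generator. This is an illustrative sketch only, not a vetted design: the class name and domain-separation labels are made up, and production code should use the operating system’s CSPRNG. The state is ratcheted through SHA-256 after every output, so earlier outputs cannot be recomputed from it, and `reseed()` mixes in entropy an attacker holding an old state has not seen.

```python
import hashlib
import os

class ToyHashDRBG:
    """Illustrative hash-based generator. Not for production use."""

    def __init__(self, seed):
        self._state = hashlib.sha256(b"init" + seed).digest()

    def generate(self):
        """Return 32 output bytes, then ratchet the state one-way."""
        output = hashlib.sha256(b"out" + self._state).digest()
        # The old state is overwritten with a hash of itself, so a later
        # compromise cannot be run backwards to recompute earlier outputs.
        self._state = hashlib.sha256(b"next" + self._state).digest()
        return output

    def reseed(self, entropy):
        """Mix fresh entropy into the state to recover from a past compromise."""
        self._state = hashlib.sha256(b"reseed" + self._state + entropy).digest()

drbg = ToyHashDRBG(os.urandom(32))
stolen = drbg._state                  # attacker snapshots the state

attacker = ToyHashDRBG(b"")
attacker._state = stolen
print(attacker.generate() == drbg.generate())  # attacker tracks the output

drbg.reseed(os.urandom(32))           # fresh entropy the attacker never saw
print(attacker.generate() == drbg.generate())  # prediction no longer works
```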
Historical Disasters: When Random Numbers Fail
The Debian OpenSSL Debacle (2008)
In 2006, a Debian developer noticed that Valgrind reported uninitialized memory reads in OpenSSL’s random number generation code. Following good software hygiene, he removed the “offending” code.
The change removed more than intended. Besides the read of uninitialized memory, which contributed little, it disabled the call that mixed all seed data into the entropy pool. After that, OpenSSL on Debian-derived systems used only the process ID as entropy.
The result: every SSL key generated on Debian or Ubuntu between 2006 and 2008 came from a pool of only 32,767 possible values. Attackers could brute-force SSH keys in seconds. The vulnerability was assigned CVE-2008-0166 and affected countless systems worldwide.
The lesson: randomness is subtle. Code that looks like a bug might be a feature.
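A seed space of that size can be searched exhaustively almost instantly. The sketch below uses a hypothetical `toy_keygen` that derives a “key” from nothing but a process ID, standing in for real key generation. The actual attack also had to account for architecture, key type, and key size, which this example ignores.

```python
import hashlib

MAX_PID = 32768  # default Linux PID limit at the time

def toy_keygen(pid):
    """Hypothetical stand-in for key generation seeded only by a process ID."""
    return hashlib.sha256(b"toy-key" + pid.to_bytes(4, "big")).hexdigest()

def find_pid(public_fingerprint):
    """Enumerate every possible PID until the derived key matches."""
    for pid in range(1, MAX_PID):
        if toy_keygen(pid) == public_fingerprint:
            return pid
    return None

victim_key = toy_keygen(4242)
print(find_pid(victim_key))
```

In practice attackers did not even need to search: lists of every possible key were precomputed and published, reducing the attack to a lookup.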
The Dual_EC_DRBG Backdoor (2006-2013)
In 2006, NIST published SP 800-90 (later revised as SP 800-90A), specifying several CSPRNG algorithms. One of them, Dual_EC_DRBG, was unusually slow and had other technical peculiarities. Cryptographers soon suspected a backdoor.
The algorithm worked by generating points on an elliptic curve. It needed two curve points, P and Q. NIST provided values for these constants—but didn’t explain how they were chosen.
If someone knew a secret relationship between P and Q (if Q = dP for some secret d), they could predict the output from just a few observed values. This is exactly how a kleptographic backdoor works: the victim’s system generates random numbers normally, but the attacker can predict them.
In 2013, documents leaked by Edward Snowden indicated that the NSA had worked to insert a backdoor into the standard, and Reuters subsequently reported that the NSA had paid RSA Security $10 million to make Dual_EC_DRBG the default in its BSAFE toolkit. NIST later withdrew the algorithm, but the incident damaged trust in standards processes.
The Android Bitcoin Wallet Theft (2013)
Android’s implementation of java.security.SecureRandom had a critical flaw. On certain devices, the generator wasn’t properly seeded. Bitcoin wallet apps that relied on it produced ECDSA signatures with repeated nonces, and two signatures that share a nonce are enough to compute the private key.
Attackers scanned the blockchain for vulnerable addresses and drained them. The Bitcoin community estimated losses in the hundreds of thousands of dollars before patches were deployed.
Google released a patch and advised developers to explicitly seed their random number generators—a workaround that contradicted the whole point of having a SecureRandom class that should “just work.”
The Lottery Rigging Scandal (2017)
Eddie Tipton, the former security director of the Multi-State Lottery Association, installed a self-destructing rootkit on the lottery’s random number generator. The malicious code produced predictable jackpot numbers on specific dates.
Tipton and his associates collected winnings across multiple states before being caught. The scheme worked because the lottery used a pseudorandom number generator without proper auditing or hardware security modules.
Creative Entropy Sources: From Lava Lamps to Radioactive Decay
When standard entropy sources are insufficient, creative engineers have found alternatives.
Lava Lamps: The Wall of Entropy
In the lobby of Cloudflare’s San Francisco office, a camera points at a wall of about 100 lava lamps. The lamps are constantly in motion—the “lava” never forms the same pattern twice. People walk past, casting shadows. The camera’s sensors have random noise.
Each frame captured by the camera becomes a source of entropy. Consider even a tiny 100×100 pixel image with three color channels per pixel. If an attacker could predict every channel value except for a single bit of precision, the frame would still contain 100 × 100 × 3 = 30,000 bits of entropy, far more than needed for any cryptographic purpose.
The entropy feed is mixed with other sources using a cryptographic hash, ensuring that compromise of any single source doesn’t compromise the whole system. Cloudflare calls this installation their “Wall of Entropy.”
Radioactive Decay
Cloudflare’s Singapore office uses a pellet of uranium (harmless in the small quantity used) as an entropy source. Radioactive decay is a quantum process—genuinely unpredictable even in principle. A Geiger counter measures the timing between decay events, and these timings become entropy.
Atmospheric Noise
The website random.org has been generating random numbers from atmospheric noise since 1998. Multiple radios tuned between stations capture the static caused by lightning strikes, solar radiation, and other atmospheric phenomena. The raw audio is processed and made available via API.
Quantum Random Number Generators: The Ultimate Source
The most fundamental source of randomness is quantum mechanics. Thermal noise and timing variations are unpredictable in practice, but they might have hidden deterministic explanations. According to quantum theory, the outcomes of individual quantum measurements are intrinsically random.
Quantum random number generators (QRNGs) exploit phenomena like:
- Photon path selection: A photon hitting a half-silvered mirror has exactly a 50% chance of being reflected or transmitted. Which path it takes is fundamentally unpredictable.
- Vacuum fluctuations: Even in a perfect vacuum, quantum field theory predicts that virtual particles continuously appear and disappear. Measuring these fluctuations yields random bits.
- Phase noise in lasers: The phase of laser light fluctuates randomly due to quantum effects.
Companies like ID Quantique and QuintessenceLabs sell QRNG devices. Some are small enough to fit in a smartphone. But QRNGs are typically slower than pseudorandom generators, so they’re usually used to seed a CSPRNG rather than provide direct output.
Statistical Testing: How Do We Know It’s Random?
Random number generators must be tested. A PRNG that produces statistical patterns might be broken even if no attack is publicly known.
The most widely used test suites are:
- NIST SP 800-22: A battery of 15 statistical tests, including frequency tests, runs tests, and spectral tests. Designed for evaluating cryptographic RNGs.
- Diehard and Dieharder: Diehard was developed by George Marsaglia, and Dieharder is Robert G. Brown’s extended successor. Both look for subtle patterns that might escape simpler statistical analyses.
- TestU01: A comprehensive suite from Pierre L’Ecuyer’s lab, implementing hundreds of tests.
A good PRNG should pass all of these. But passing doesn’t prove security—it just means no obvious patterns were detected. A backdoored generator could pass all statistical tests while being predictable to someone with the secret key.
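As an illustration of what such a test looks like, here is a sketch of the simplest one in NIST SP 800-22, the frequency (monobit) test. It checks only whether ones and zeros are roughly balanced and rejects the sequence when the p-value falls below 0.01. The function name is illustrative, and the input is assumed to be non-empty.

```python
import math
import os

def monobit_p_value(data):
    """NIST SP 800-22 frequency (monobit) test: returns the p-value."""
    n = len(data) * 8
    ones = sum(bin(byte).count("1") for byte in data)
    s = abs(ones - (n - ones))            # |#ones - #zeros|
    return math.erfc(s / math.sqrt(n) / math.sqrt(2))

print(monobit_p_value(os.urandom(125_000)))   # OS randomness
print(monobit_p_value(b"\x00" * 1000))        # all zeros: rejected
print(monobit_p_value(b"\xAA" * 1000))        # 10101010...: perfectly balanced
```

The last case shows the limitation described above: a trivially predictable alternating pattern gets a perfect score on this test, which is why suites combine many tests and why passing them still says nothing about security.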
Practical Recommendations
For application developers:
Use your operating system’s CSPRNG. On Linux, use getrandom(). On Windows, use BCryptGenRandom(). In Java, use SecureRandom. In Python, use the secrets module for security-critical code. These interfaces are maintained by people who understand the subtleties.
Never seed a PRNG with predictable values. Using the current time, process ID, or network address as a seed is catastrophically insecure. Modern CSPRNGs seed themselves properly—let them.
Don’t implement your own PRNG. The history of cryptography is littered with homemade random number generators that looked clever but fell to simple attacks. Use established, well-analyzed algorithms.
Be aware of virtualization. If you’re deploying to cloud VMs, ensure your images have entropy available during first boot. Cloud providers offer entropy injection services for exactly this reason.
Reseed periodically for long-running processes. If your server keeps its own user-space generator running for months, periodically mix in fresh entropy from the OS. This maintains backward secrecy: an attacker who captures the internal state at some point cannot keep predicting outputs after the next reseed.
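In Python, following these recommendations takes only a few lines. The sketch below uses the standard `secrets` module and `random.SystemRandom`, both of which draw from the operating system’s CSPRNG; the function names are illustrative.

```python
import random
import secrets

def new_session_token():
    """Return a URL-safe token carrying 32 bytes of OS-provided randomness."""
    return secrets.token_urlsafe(32)

def shuffled_deck():
    """Return a 52-card deck shuffled with the OS CSPRNG."""
    deck = list(range(52))
    random.SystemRandom().shuffle(deck)  # draws from os.urandom, ignores seeds
    return deck

def roll_die():
    """Return a uniform integer in 1..6 without modulo bias."""
    return secrets.randbelow(6) + 1

print(new_session_token())
print(shuffled_deck()[:5])
print(roll_die())
```

There is no seed to choose and therefore no seed to get wrong, which is the point of delegating to the OS.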
The Future: Entropy Everywhere
The trend in computing is toward more entropy, from more sources, available everywhere. Hardware RNGs are now standard in consumer CPUs. Cloud providers inject entropy into VMs. Browsers provide the crypto.getRandomValues() API, so even JavaScript, long a wasteland of Math.random() insecurity, now has a built-in source of secure randomness.
The poker site disaster that opened this article wouldn’t happen today—or at least, it shouldn’t. The tools for secure randomness are widely available. The challenge now is ensuring they’re used correctly.
Every time you visit an HTTPS website, generate a password, or shuffle a virtual deck of cards, you’re relying on this invisible infrastructure. A few hundred bits of entropy, harvested from the chaotic physical world, stretched through mathematical algorithms, and delivered to the application that needs unpredictability. It’s one of the most critical yet least appreciated achievements of modern computing.
References
- Intel Corporation. (2018). Intel Digital Random Number Generator (DRNG) Software Implementation Guide. Intel Developer Zone.
- Gutterman, Z., Pinkas, B., & Reinman, T. (2006). Analysis of the Linux Random Number Generator. IEEE Symposium on Security and Privacy.
- Bernstein, D. J. (2008). ChaCha, a variant of Salsa20. Workshop Record of SASC.
- Schneier, B. (2007). Did NSA Put a Secret Backdoor in New Encryption Standard? Wired.
- NIST. (2015). SP 800-90A Rev. 1: Recommendation for Random Number Generation Using Deterministic Random Bit Generators.
- Debian Security Advisory DSA-1571-1. (2008). New openssl packages fix predictable random number generator.
- Cloudflare. (2017). LavaRand in Production: The Nitty-Gritty Technical Details. Cloudflare Blog.
- Herrero-Collantes, M., & Garcia-Escartin, J. C. (2017). Quantum random number generators. Reviews of Modern Physics, 89(1).
- L’Ecuyer, P., & Simard, R. (2007). TestU01: A C library for empirical testing of random number generators. ACM Transactions on Mathematical Software.
- Wikipedia contributors. (2024). Dual_EC_DRBG. Wikipedia, The Free Encyclopedia.