On June 5, 2012, a Russian hacker named Yevgeniy Nikulin accessed LinkedIn’s database and exfiltrated 6.5 million password hashes. What happened next became a textbook case of what not to do with passwords. LinkedIn had stored those passwords using SHA-1—without any salt. Within hours, security researchers were cracking thousands of passwords per minute. By the time LinkedIn disclosed the breach, over 60% of the stolen hashes had already been reversed.

The breach exposed a fundamental truth that many developers still ignore: the choice of hashing algorithm matters more than almost any other security decision. A password like “linkedin123” hashed with SHA-1 offers essentially no protection against modern attacks. The same password, properly hashed with Argon2id, would require decades of GPU time to crack. The difference isn’t the password—it’s the math.

The Speed Problem: When Fast Becomes Fatal

Cryptographic hash functions like MD5 and SHA-1 were designed for a specific purpose: verifying data integrity as quickly as possible. When you download a file, you want its checksum computed in milliseconds. Speed is a feature, not a bug.

But for password storage, speed is exactly what you don’t want. An RTX 4090 GPU can compute approximately 80 billion MD5 hashes per second. Let that sink in: every second, a consumer graphics card can try 80 billion password guesses. A 12-character password using only lowercase letters has about 95 quadrillion possible combinations ($26^{12} \approx 9.5 \times 10^{16}$). At 80 billion hashes per second, that entire keyspace is exhausted in about two weeks.

The mathematics are brutal:

$$\text{Time to crack} = \frac{\text{Keyspace}}{\text{Hashes per second}}$$

For an 8-character password with uppercase, lowercase, numbers, and symbols (95 characters, $95^8 \approx 6.6 \times 10^{15}$ combinations), an MD5 hash falls in roughly a day (about 23 hours) on a single modern GPU.
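These back-of-the-envelope numbers are easy to reproduce. The sketch below treats the 80 billion hashes per second figure as an assumed constant, not a benchmark:

```python
# Worst-case time to exhaust a keyspace at a fixed hash rate.
MD5_HASHES_PER_SEC = 80e9  # assumed single-GPU MD5 rate from the text above

def crack_time_seconds(alphabet_size: int, length: int,
                       hashes_per_sec: float = MD5_HASHES_PER_SEC) -> float:
    """Time to try every password of `length` chars over `alphabet_size`."""
    keyspace = alphabet_size ** length
    return keyspace / hashes_per_sec

# 8 chars, full 95-character printable set: ~6.6e15 candidates
print(f"{crack_time_seconds(95, 8) / 3600:.0f} hours")   # ~23 hours

# 12 chars, lowercase only: ~9.5e16 candidates
print(f"{crack_time_seconds(26, 12) / 86400:.0f} days")  # ~14 days
```

The same function makes the case for slow hashes: divide the rate by six orders of magnitude and the 23 hours becomes millennia.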

This is why using fast hash functions for password storage isn’t just bad practice—it’s practically negligent. The same computational efficiency that makes SHA-256 excellent for blockchain verification makes it catastrophically wrong for protecting user credentials.

Rainbow Tables: The Precomputation Attack

Before we understood how dangerous fast hashes were, attackers developed an even more efficient approach: rainbow tables. Instead of computing hashes on demand, why not precompute them once and look them up?

A rainbow table is a massive database mapping hash values back to their original passwords. The technique, formalized by Philippe Oechslin in 2003, uses a clever optimization called “hash chains” to compress what would otherwise be petabytes of data into manageable sizes.

The process works like this: start with a plaintext password, hash it, apply a reduction function to convert the hash back into a new plaintext candidate, hash that, and repeat. The chain stores only the starting plaintext and the final hash. When you encounter a hash you want to crack, you apply the reduction and hashing process until you find a match with a chain’s endpoint, then reconstruct the chain from its starting point to find the original password.

```mermaid
block-beta
    columns 6
    block:chain1:6
        A["plaintext<br/>aaaaaa"]
        B["Hash H()"]
        C["hash1"]
        D["Reduce R1()"]
        E["plaintext<br/>culture"]
        F["Hash H()"]
    end
    space:6
    block:chain2:6
        G["hash2"]
        H["Reduce R2()"]
        I["plaintext<br/>linux23"]
        J["Hash H()"]
        K["hash3"]
        L["Reduce R3()"]
    end
    space:6
    block:chain3:6
        M["plaintext<br/>rambo"]
        N["Hash H()"]
        O["hash4"]
        P["Reduce R4()"]
        Q["plaintext<br/>re3xes"]
        R["Hash H()"]
    end
    space:6
    block:endpoint:6
        S["hash5"]
        T["Reduce R5()"]
        U["plaintext<br/>kiebgt<br/>(endpoint)"]
        V[" "]
        W[" "]
        X[" "]
    end
```

Rainbow tables made short work of unsalted hashes. A standard MD5 rainbow table covering common passwords could be downloaded from the internet and used to crack millions of hashes in seconds—no GPU required. The LinkedIn breach demonstrated this perfectly: because the SHA-1 hashes lacked salts, attackers could simply look up each hash in their precomputed tables.
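The chain mechanics described above can be sketched in a few lines. This is a toy illustration, not Oechslin’s full construction: a real rainbow table uses a distinct reduction function per chain position, which the position index stands in for here, and all names are made up for the example.

```python
import hashlib

CHAIN_LEN = 200
CHARSET = "abcdefghijklmnopqrstuvwxyz"

def H(pw: str) -> str:
    return hashlib.md5(pw.encode()).hexdigest()

def reduce(h: str, position: int, length: int = 6) -> str:
    """Map a hash back into password space; varies by chain position."""
    n = int(h, 16) + position
    out = []
    for _ in range(length):
        n, r = divmod(n, len(CHARSET))
        out.append(CHARSET[r])
    return "".join(out)

def build_chain(start: str) -> str:
    """Walk start -> hash -> reduce -> ... and return only the endpoint."""
    pw = start
    for i in range(CHAIN_LEN):
        pw = reduce(H(pw), i)
    return pw

# The table stores only (endpoint, start) pairs -- the whole space saving.
table = {build_chain(s): s for s in ("aaaaaa", "rambos", "qwerty")}

def lookup(target_hash: str):
    """Guess each position the hash could occupy, walk to an endpoint."""
    for pos in range(CHAIN_LEN - 1, -1, -1):
        pw = reduce(target_hash, pos)
        for i in range(pos + 1, CHAIN_LEN):
            pw = reduce(H(pw), i)
        if pw in table:
            # Re-walk the matching chain from its start to recover the preimage.
            cand = table[pw]
            for i in range(CHAIN_LEN):
                if H(cand) == target_hash:
                    return cand
                cand = reduce(H(cand), i)
    return None
```

The lookup cost is quadratic in the chain length, but that is the trade: longer chains mean a smaller table at the price of slower lookups.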

Why Salting Stops Rainbow Tables

A cryptographic salt is a random value, unique to each password, that’s combined with the password before hashing. Instead of computing $H(\text{password})$, you compute $H(\text{password} + \text{salt})$.

This simple change defeats rainbow tables completely. Precomputing a rainbow table for one specific salt is expensive but possible. But if every password has a different salt, an attacker would need to build a separate rainbow table for each one—defeating the entire purpose of precomputation.

Consider two users with the password “password123”:

  • User A: $H(\text{password123} + \text{a7f3b2}) = \text{unique\_hash\_A}$
  • User B: $H(\text{password123} + \text{9c1e4d}) = \text{unique\_hash\_B}$

The same password produces completely different hashes. An attacker who compromises the database sees no patterns—they must crack each hash individually, starting from scratch.
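The per-user salt scheme is straightforward with nothing but the standard library. A minimal sketch (the PBKDF2 iteration count here is illustrative, not a tuning recommendation):

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative only; tune for your hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) with a fresh random salt per password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)  # constant-time compare

# Two users with the same password get completely different hashes:
salt_a, hash_a = hash_password("password123")
salt_b, hash_b = hash_password("password123")
print(hash_a != hash_b)  # True: the salts differ
```

Note that the salt is stored in the clear alongside the digest; its job is uniqueness, not secrecy.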

bcrypt: Making Speed Expensive

In 1999, Niels Provos and David Mazières presented a paper at USENIX that would fundamentally change password security. Their creation, bcrypt, introduced a deceptively simple idea: what if you could make hashing arbitrarily slow?

Bcrypt is built on the Blowfish cipher, specifically exploiting its expensive key setup phase. The underlying cipher variant is called “eksblowfish,” short for “expensive key schedule Blowfish.” Where normal Blowfish sets up its encryption keys efficiently, bcrypt deliberately makes this process slow and configurable.

The Work Factor

Bcrypt introduces a cost parameter (also called work factor) that determines how many iterations of key expansion to perform. This parameter is exponential: a work factor of 10 means $2^{10} = 1,024$ iterations. A work factor of 12 means $2^{12} = 4,096$ iterations.

The genius of this design is its adaptability. In 1999, a work factor of 6 might have provided adequate security. Today, with GPUs billions of times faster, you can simply increase the work factor to 12 or 14. The same algorithm grows stronger as hardware improves.

A bcrypt hash output looks like this:

$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/LewY5GyYl3/0NQxOa

Breaking this down:

  • $2b$ — Algorithm identifier (bcrypt, version 2b)
  • 12 — Work factor (2^12 = 4,096 iterations)
  • LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/LewY5GyYl3/0NQxOa — Base64-encoded salt (22 characters) and hash (31 characters)
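This modular-crypt-format string can be pulled apart mechanically. A small parser for the fields just described (format only; it does not validate the hash itself):

```python
def parse_bcrypt(hash_str: str):
    """Split a bcrypt string into (version, work_factor, salt, digest)."""
    _, version, cost, salt_and_digest = hash_str.split("$")
    # bcrypt's base64 alphabet encodes the 16-byte salt in the first
    # 22 characters; the remaining 31 encode the 23-byte digest.
    return version, int(cost), salt_and_digest[:22], salt_and_digest[22:]

ver, wf, salt, digest = parse_bcrypt(
    "$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/LewY5GyYl3/0NQxOa")
print(ver, wf, 2 ** wf)  # 2b 12 4096
```

Because the salt and cost travel inside the hash string itself, verification needs no separate columns in the database, and the work factor can be raised per-user at next login.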

The work factor’s exponential scaling means each increment doubles the computation time. Going from work factor 12 to 13 makes password verification twice as slow for both legitimate users and attackers. But here’s the crucial asymmetry: your server processes one login attempt per user; an attacker processes billions of guesses.

Why bcrypt Resists GPU Attacks

Modern GPUs achieve their incredible speeds through massive parallelism—thousands of cores working simultaneously. But bcrypt’s design intentionally uses operations that don’t parallelize well. The Blowfish key schedule involves repeated operations where each step depends on the previous one. You can’t compute step 1,000 until you’ve finished step 999.

This dependency chain means a GPU core, no matter how fast, can’t process bcrypt significantly quicker than a CPU. The RTX 4090 that computes 80 billion MD5 hashes per second manages only on the order of 100,000 bcrypt hashes per second at low work factors, and far fewer at work factor 12. That is a reduction of nearly six orders of magnitude in cracking speed.

But bcrypt isn’t perfect. Its 72-byte input limit means very long passwords get truncated. Its 4 KB memory footprint, while challenging for GPUs in 1999, is trivially small for modern hardware with gigabytes of VRAM.

Memory-Hard Functions: The Next Frontier

By the early 2010s, password crackers had found bcrypt’s weak point: memory. While bcrypt’s CPU-bound operations slowed down GPUs, specialized ASICs (Application-Specific Integrated Circuits) could still achieve reasonable speeds. The problem was that bcrypt only required about 4 KB of memory—nothing for hardware designed with gigabytes of RAM.

Colin Percival addressed this in 2009 with scrypt, introducing the concept of memory-hard functions. The idea: make the hashing algorithm require significant memory to compute efficiently. If you try to compute it with less memory, the computation becomes dramatically slower.

This creates a fundamental trade-off for attackers. Custom hardware for password cracking faces a physical constraint: memory is expensive, both in silicon area and power consumption. An ASIC that can compute SHA-256 in a few thousand gates needs millions of gates for a memory-hard function—and each gate adds cost, heat, and power draw.
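Python ships scrypt in the standard library (hashlib.scrypt, backed by OpenSSL), so trying a memory-hard function requires no dependencies. The parameters below are modest and illustrative: n is the CPU/memory cost, r the block size, p the parallelism.

```python
import hashlib
import os

salt = os.urandom(16)

# n=2**14, r=8 requires roughly 128 * n * r bytes = 16 MiB to compute.
digest = hashlib.scrypt(b"correct horse battery staple",
                        salt=salt, n=2**14, r=8, p=1, dklen=32)
print(len(digest))  # 32-byte derived key
```

Doubling n doubles both the time and the memory an attacker must commit per guess, which is exactly the property a GPU or ASIC farm cannot cheaply scale.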

The Password Hashing Competition

In 2013, recognizing the need for a standardized memory-hard algorithm, the cryptographic community launched the Password Hashing Competition (PHC). Inspired by the successful SHA-3 competition, PHC sought to identify the best password hashing algorithm through open, peer-reviewed analysis.

The competition received 24 submissions, which were evaluated on security, performance, flexibility, and simplicity. On July 20, 2015, the winner was announced: Argon2.

Argon2: Memory-Hard by Design

Argon2 was designed by Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich at the University of Luxembourg. It comes in three variants, each optimized for different threat models:

  • Argon2d: Uses data-dependent memory access, providing maximum resistance against GPU cracking but potentially vulnerable to side-channel timing attacks
  • Argon2i: Uses data-independent memory access, resistant to side-channel attacks but slower
  • Argon2id: A hybrid that uses data-independent access for the first pass and data-dependent access for subsequent passes

OWASP recommends Argon2id as the default choice for password hashing, as it provides balanced protection against both GPU attacks and side-channel attacks.

How Argon2 Uses Memory

Argon2 works by filling a large block of memory with pseudorandom data derived from the password and salt, then repeatedly transforming this data in ways that require random access across the entire memory block.

Here’s the crucial insight: if you don’t store the full memory block, you can’t compute the final hash efficiently. Every time the algorithm needs a value from memory, you’d have to recompute it from scratch, and each recomputed block may itself depend on other blocks you discarded. The penalty cascades: try to get by with half the required memory and the computation can run a hundred times longer or more.
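A toy model of this fill-then-mix structure (not Argon2 itself, just the shape of the idea, using BLAKE2b as the mixing primitive and made-up block counts):

```python
import hashlib

def toy_memory_hard(password: bytes, salt: bytes,
                    blocks: int = 4096, passes: int = 2) -> bytes:
    """Fill `blocks` 64-byte blocks, then mix with data-dependent reads."""
    # Phase 1: fill memory with pseudorandom blocks derived from the inputs.
    mem = [hashlib.blake2b(password + salt + i.to_bytes(4, "big")).digest()
           for i in range(blocks)]
    # Phase 2: repeatedly transform blocks using values read from
    # data-dependent positions, forcing the whole array to stay resident.
    for _ in range(passes):
        for i in range(blocks):
            j = int.from_bytes(mem[i][:4], "big") % blocks
            mem[i] = hashlib.blake2b(mem[i] + mem[j]).digest()
    return hashlib.blake2b(b"".join(mem[-8:])).digest()
```

Discard half of mem and every missing read forces a recomputation, which itself triggers further reads: the cascade that makes low-memory evaluation so much slower.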

The algorithm accepts three tunable parameters:

  • Memory cost (m): Amount of memory in kilobytes to use
  • Time cost (t): Number of iterations over the memory
  • Parallelism (p): Number of parallel threads (lanes)

OWASP recommends one of these equivalent configurations:

  • m=47104 (46 MiB), t=1, p=1
  • m=19456 (19 MiB), t=2, p=1
  • m=12288 (12 MiB), t=3, p=1

These configurations provide equivalent security; the trade-off is between RAM usage and CPU time.
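A quick sanity check on why these are considered roughly equivalent: the attacker’s cost scales with the memory-time product m × t, which is nearly constant across the three configurations.

```python
# OWASP's three suggested (memory_kib, iterations) pairs; p=1 in all cases.
configs = [(47104, 1), (19456, 2), (12288, 3)]

for m, t in configs:
    # Total memory traffic per hash scales with m * t.
    print(f"m={m} KiB, t={t} -> {m * t} KiB-passes")
```

All three land within about 25% of each other, so you pick based on whether RAM or CPU time is scarcer on your servers.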

The GPU Speedup Gap

The difference between memory-soft and memory-hard functions becomes stark when comparing GPU acceleration:

| Algorithm | GPU Speedup vs CPU | Reason |
|---|---|---|
| MD5 | ~620,000x | No memory, highly parallelizable |
| SHA-256 | ~50,000x | No memory, highly parallelizable |
| bcrypt (work factor 12) | ~100x | Limited parallelism, 4 KB memory |
| Argon2id (64 MB) | ~1.5x | Memory-hard, bandwidth limited |

With Argon2id using 64 MB of memory, an attacker’s GPU provides almost no advantage over a CPU. The memory bandwidth becomes the bottleneck, not the computational cores. This fundamentally changes the economics of password cracking—custom ASICs no longer provide a meaningful speedup over commodity hardware.

The Trade-off Triangle: Memory, Time, and Security

Every password hashing algorithm involves trade-offs between three factors:

  1. Security: How resistant the hash is to cracking attempts
  2. Memory usage: How much RAM is required to compute the hash
  3. Computation time: How long it takes to compute a hash

Bcrypt optimizes for simplicity and reasonable security with minimal memory. Argon2 optimizes for maximum security at the cost of complexity and memory. PBKDF2 optimizes for FIPS compliance but lacks memory-hardness.

The right choice depends on your constraints:

  • Memory-constrained environments (embedded systems, microservices): bcrypt remains a solid choice
  • High-security applications (password managers, financial services): Argon2id with 64+ MB memory
  • FIPS-compliant systems: PBKDF2 with SHA-256 and 600,000+ iterations
  • Legacy systems: Upgrade to Argon2id or bcrypt if possible; at minimum, increase iterations

Practical Recommendations

For new applications, OWASP’s recommendations are clear:

  1. Use Argon2id as the default with at least 19 MiB memory, 2 iterations, and 1 degree of parallelism
  2. If Argon2id isn’t available, use bcrypt with a work factor of at least 10
  3. Never use MD5, SHA-1, SHA-256, or any fast hash for password storage
  4. Always include a unique salt per password (modern algorithms handle this automatically)

For verification time, aim for 100-500 milliseconds per hash. This is fast enough for legitimate users but slow enough to make bulk cracking economically infeasible. Remember: attackers work at the scale of billions of attempts; defenders work at the scale of one.
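The 100-500 millisecond target is something you calibrate on your own hardware, not guess. A sketch using stdlib PBKDF2 as the stand-in KDF (the same approach applies to bcrypt’s work factor or Argon2’s time cost; the scaling heuristic is an assumption for illustration):

```python
import hashlib
import os
import time

def calibrate_iterations(target_ms: float = 250.0) -> int:
    """Find a PBKDF2 iteration count that takes ~target_ms on this host."""
    salt = os.urandom(16)
    iters = 10_000
    while True:
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark", salt, iters)
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms >= target_ms:
            return iters
        # Scale up toward the target; at least 1.5x per step so it terminates.
        iters = int(iters * max(1.5, target_ms / elapsed_ms))

print(calibrate_iterations())  # machine-dependent; often several hundred thousand
```

Re-run the calibration when you migrate hardware, and store the chosen parameters alongside each hash so old entries can be upgraded at next login.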

The 2012 LinkedIn breach taught the industry an expensive lesson. Six and a half million users learned that their passwords—including reused passwords across other sites—were suddenly in criminals’ hands. The breach could have been prevented with proper salting, or made far less damaging with bcrypt or Argon2.

Password hashing isn’t just about algorithms—it’s about understanding the economics of attacks. When you make each guess expensive, you change the calculus for attackers. A password that costs $0.00001 to verify on your server might cost $100 to crack with proper hashing, or $0.0000001 with improper hashing. That million-fold difference is the difference between a secure system and tomorrow’s breach headline.

References

  1. Provos, N., & Mazières, D. (1999). A Future-Adaptable Password Scheme. USENIX Annual Technical Conference.
  2. Oechslin, P. (2003). Making a Faster Cryptanalytic Time-Memory Trade-Off. CRYPTO 2003.
  3. Biryukov, A., Dinu, D., & Khovratovich, D. (2015). Argon2: the memory-hard function for password hashing and other applications. Password Hashing Competition.
  4. OWASP Foundation. (2024). Password Storage Cheat Sheet. OWASP Cheat Sheet Series.
  5. Wikipedia. (2025). 2012 LinkedIn hack. Retrieved from https://en.wikipedia.org/wiki/2012_LinkedIn_hack
  6. Hive Systems. (2025). The 2025 Hive Systems Password Table. Retrieved from https://www.hivesystems.com/blog/are-your-passwords-in-the-green
  7. Password Hashing Competition. (2015). Argon2 Specification. Retrieved from https://www.password-hashing.net/argon2-specs.pdf