Todor Bogosavljević

2024/12/16

BIP39: Mnemonics and deterministic wallets

Programming Cryptography

A personal story

Lately I’ve been investing some of my money into cryptocurrency, and I wanted to back up my private key in case my wallet provider revokes my access to the wallet or shuts down. While attempting to do this, I wasn’t able to locate my private key, but I was able to find 12 mysterious words, which led me down a caffeine-fueled rabbit hole from which this post was born.

Private keys

In the realm of cryptocurrency, your private key is everything. It’s the authorization you need to transfer funds and prove your ownership of the wallet. Although this is a mathematically sound system, we humans are not built to memorize long strings of random characters. For this reason, an improvement called BIP39 was proposed.

BIP39

BIP39 is a Bitcoin Improvement Proposal that defines a standard way to generate deterministic wallets from mnemonic phrases. In a nutshell, it derives a wallet key from a set of words that are easy to remember. Take the following words:

calm option expect pizza common sad eight crack acid crew brief kitchen

From these 12 words alone, we can deterministically derive a wallet key such as:

85c56e10d4b293fac135c6ebf7dcf09208c0e23b2025d94a17b9ca3613edb462

These 12 simple words let you fully recover access to your funds if you lose your private key. Some wallets, such as Proton Wallet, do not even expose your private key to you (even though they are self-custodial), but they do give you access to these 12 words, which (for all purposes that matter) are your private key.

How does it work?

Cryptography in general requires very precise steps, but if you follow them, the process itself is very simple. Let’s go through the process:

  1. Entropy generation

    A random sequence of bits, called entropy, is generated. The length of the entropy determines the number of words:

    • 128 bits will produce 12 words
    • 256 bits will produce 24 words
  2. Checksum calculation

    A checksum is computed by taking the first ENT / 32 bits of the SHA-256 hash of the entropy (where ENT is the entropy length in bits), after which the checksum is appended to the entropy.

  3. Mnemonic encoding (where the words come out!)

    The combined sequence is split into 11-bit segments. Each segment is read as an index (0 to 2047) into a predefined wordlist, and the words at those indices form the final mnemonic phrase.
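The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `entropy_to_indices` is my own, and it stops at the word indices because the 2048-word list is too long to embed here. With the official English wordlist loaded into a list, the phrase would simply be the word at each index.

```python
import hashlib
import secrets


def entropy_to_indices(entropy: bytes) -> list[int]:
    """Turn raw entropy into BIP39 word indices (0..2047)."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128, 160, 192, 224 or 256 bits")

    # Step 2: the checksum is the first ENT / 32 bits of SHA-256(entropy).
    cs_bits = ent_bits // 32
    first_hash_byte = hashlib.sha256(entropy).digest()[0]
    checksum = first_hash_byte >> (8 - cs_bits)

    # Append the checksum bits to the entropy bits.
    combined = (int.from_bytes(entropy, "big") << cs_bits) | checksum

    # Step 3: cut the result into 11-bit segments, most significant first.
    n_words = (ent_bits + cs_bits) // 11
    return [
        (combined >> (11 * (n_words - 1 - i))) & 0x7FF
        for i in range(n_words)
    ]


# Step 1: 128 bits of fresh entropy gives 12 indices, 256 bits gives 24.
indices = entropy_to_indices(secrets.token_bytes(16))
print(len(indices), indices)
```

As a sanity check, 128 bits of all-zero entropy gives index 0 eleven times followed by index 3, which in the English wordlist reads "abandon" eleven times and then "about".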

Example

We take the generated entropy such as: 1101001010111010...

And the resulting checksum: 0010...

Concatenate them like such: 11010010101110100010...

Split them into 11-bit segments: 11010010101 / 110100010...

And map them to words using the wordlist: "apple banana cherry ..."
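To double-check the split by hand, here is a tiny sketch that only slices the toy bit strings from this example. The bit strings are truncated (real entropy is at least 128 bits), so the last segment comes out incomplete, and the helper name `split_segments` is just something I made up for illustration.

```python
# Toy values copied from the example; real inputs are much longer.
entropy_bits = "1101001010111010"
checksum_bits = "0010"
combined = entropy_bits + checksum_bits


def split_segments(bits: str, size: int = 11) -> list[str]:
    """Cut a bit string into consecutive chunks of `size` bits."""
    return [bits[i:i + size] for i in range(0, len(bits), size)]


segments = split_segments(combined)
first_index = int(segments[0], 2)  # the 11-bit segment read as a number

print(segments)      # ['11010010101', '110100010']
print(first_index)   # 1685
```

The first segment is a full 11 bits and decodes to index 1685. The second one only has 9 bits because the example was cut short.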

Wordlist

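This process uses a fixed wordlist containing 2048 unique words (2^11, so every 11-bit segment maps to exactly one word). Each word in the list is carefully chosen to:

  • be identifiable by its first four letters alone
  • avoid being easily confused with another word in the list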

It is also worth noting that the wordlist is language-specific, with versions available in multiple languages such as English, French, and Japanese.

Seed generation

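The mnemonic is not a private key itself. It is first turned into a 512-bit seed using the PBKDF2 (Password-Based Key Derivation Function 2) algorithm, with HMAC-SHA512 as the hash, 2048 iterations, and the string "mnemonic" followed by an optional passphrase as the salt. That seed is then used to derive the wallet's private keys.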
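The seed step can be sketched with nothing but Python's standard library. Treat this as a rough illustration: `mnemonic_to_seed` is a name I picked, and the function does not check that the words are in the wordlist or that the checksum is valid.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic phrase into a 64-byte (512-bit) seed."""
    # Both inputs are NFKD-normalized and UTF-8 encoded before hashing.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


words = "calm option expect pizza common sad eight crack acid crew brief kitchen"
seed = mnemonic_to_seed(words)
print(len(seed), seed.hex())
```

The same words with a different passphrase give a completely different seed, which is why a passphrase works as an extra secret on top of the 12 words. Turning the seed into actual private keys is a separate step (wallets typically follow BIP32 for that) and is out of scope here.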

But nothing is perfect in the cryptography space

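Mainly, checksums. The checksum can only detect errors; it cannot correct them. For a 12-word phrase it is just 4 bits long, so roughly 1 in 16 phrases with a wrong word will still pass validation, and even when an error is detected, nothing tells you which word is wrong. If you can’t reconstruct the correct phrase, your wallet is lost forever.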
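To get a feel for how weak a 4-bit checksum is, here is a small sketch that validates a phrase given as word indices (again, only because the full wordlist doesn't fit in a blog post). The function name `checksum_ok` is my own invention, and the code assumes the indices are already in the range 0 to 2047.

```python
import hashlib


def checksum_ok(indices: list[int]) -> bool:
    """Check the BIP39 checksum of a phrase given as word indices."""
    if len(indices) not in (12, 15, 18, 21, 24):
        raise ValueError("a mnemonic has 12, 15, 18, 21 or 24 words")

    total_bits = len(indices) * 11
    cs_bits = total_bits // 33          # 1 checksum bit per 32 entropy bits
    ent_bits = total_bits - cs_bits

    # Glue the 11-bit indices back into one big number.
    combined = 0
    for index in indices:
        combined = (combined << 11) | index

    entropy = (combined >> cs_bits).to_bytes(ent_bits // 8, "big")
    checksum = combined & ((1 << cs_bits) - 1)

    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    return checksum == expected


# Keep the first 11 words fixed and try every possible last word.
valid_last_words = sum(checksum_ok([0] * 11 + [last]) for last in range(2048))
print(valid_last_words, "of 2048 last words pass the checksum")
```

This prints 128 of 2048, which is exactly 1 in 16. A typo that happens to land on one of those 128 words produces a perfectly valid phrase for a different, empty wallet.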

In conclusion

BIP39 revolutionized wallet key management by combining ease of use with robust cryptographic principles. Its genius is that the end user reaps the benefits with very little effort. I hope you come out of this understanding its mnemonic generation and technical structure a bit better, and that it helps you protect your cryptocurrency holdings. However, as with any cryptographic tool, best practices and vigilance are essential to maximizing security. There is no magic, only math.