<pre>
  BIP:     BIP-0039
  Title:   Mnemonic code for generating deterministic keys
  Authors: Marek Palatinus <slush@satoshilabs.com>
           Pavol Rusnak <stick@satoshilabs.com>
           ThomasV <thomasv@bitcointalk.org>
           Aaron Voisine <voisine@gmail.com>
           Sean Bowe <ewillbefull@gmail.com>
  Status:  Draft
  Type:    Standards Track
  Created: 2013-09-10
</pre>

==Abstract==

This BIP describes the implementation of a mnemonic code or mnemonic sentence --
a group of easy-to-remember words -- for the generation of deterministic wallets.

It consists of two parts: generating the mnemonic, and converting it into a
binary seed. This seed can later be used to generate deterministic wallets using
BIP-0032 or similar methods.

==Motivation==

A mnemonic code or sentence is superior for human interaction compared to the
handling of raw binary or hexadecimal representations of a wallet seed. The
sentence could be written on paper or spoken over the telephone.

This guide is meant to be a way to transport computer-generated randomness with
a human-readable transcription. It is not a way to process user-created
sentences (also known as brainwallets) into a wallet seed.

==Generating the mnemonic==

The mnemonic must encode entropy in any multiple of 32 bits. More entropy
improves security but increases the sentence length. We refer to the
initial entropy length as ENT. The recommended size of ENT is 128-256 bits.

First, an initial entropy of ENT bits is generated. A checksum is generated by
taking the first <code>ENT / 32</code> bits of its SHA256 hash. This checksum is
appended to the end of the initial entropy. Next, these concatenated bits
are split into groups of 11 bits, each encoding a number from 0-2047, serving
as an index into a wordlist. Later, we will convert these numbers into words and
use the joined words as a mnemonic sentence.
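
The steps in the paragraph above can be sketched in Python as follows. This is a minimal illustration rather than the reference implementation: the function name <code>entropy_to_indices</code> is hypothetical, and the sketch accepts only the recommended 128-256 bit range.

```python
import hashlib
import secrets


def entropy_to_indices(entropy):
    """Return the 11-bit wordlist indices for the given entropy bytes."""
    ent_bits = len(entropy) * 8
    # Assumption of this sketch: only the recommended ENT sizes are accepted.
    if ent_bits % 32 != 0 or not 128 <= ent_bits <= 256:
        raise ValueError("ENT must be a multiple of 32 between 128 and 256 bits")

    cs_bits = ent_bits // 32  # checksum length CS = ENT / 32
    digest = hashlib.sha256(entropy).digest()
    # First CS bits of the SHA256 hash of the entropy.
    checksum = int.from_bytes(digest, "big") >> (256 - cs_bits)

    # Append the checksum to the end of the entropy bits.
    combined = (int.from_bytes(entropy, "big") << cs_bits) | checksum

    # Split into 11-bit groups, most significant group first.
    word_count = (ent_bits + cs_bits) // 11
    return [
        (combined >> (11 * (word_count - 1 - i))) & 0x7FF
        for i in range(word_count)
    ]


# Usage: 128 bits of fresh entropy give 12 indices in the range 0-2047.
indices = entropy_to_indices(secrets.token_bytes(16))
print(len(indices), indices)
```

Each index would then be looked up in the wordlist to obtain the corresponding word of the sentence.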

The following table describes the relation between the initial entropy
length (ENT), the checksum length (CS) and the length of the generated mnemonic
sentence (MS) in words.

<pre>
CS = ENT / 32
MS = (ENT + CS) / 11

|  ENT  | CS | ENT+CS |  MS  |
+-------+----+--------+------+
|  128  |  4 |   132  |  12  |
|  160  |  5 |   165  |  15  |
|  192  |  6 |   198  |  18  |
|  224  |  7 |   231  |  21  |
|  256  |  8 |   264  |  24  |
</pre>
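
The division by 11 is always exact: if ENT = 32k, then ENT + CS = 33k = 11 * 3k, so MS = 3k words. As a small check, the table can be reproduced with a few lines of Python. The helper name <code>mnemonic_lengths</code> is made up for this illustration.

```python
def mnemonic_lengths(ent_bits):
    """Return (CS, MS) for an entropy length ENT given in bits."""
    if ent_bits <= 0 or ent_bits % 32 != 0:
        raise ValueError("ENT must be a positive multiple of 32 bits")
    cs = ent_bits // 32          # CS = ENT / 32
    ms = (ent_bits + cs) // 11   # MS = (ENT + CS) / 11, always exact
    return cs, ms


# Print one row per recommended entropy size.
for ent in range(128, 257, 32):
    cs, ms = mnemonic_lengths(ent)
    print(ent, cs, ent + cs, ms)
```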

==Wordlist==

An ideal wordlist has the following characteristics:

a) smart selection of words
   - the wordlist is created in such a way that typing the first four
     letters is enough to unambiguously identify a word

b) similar words avoided
   - word pairs like "build" and "built", "woman" and "women", or "quick" and "quickly"
     not only make remembering the sentence difficult, but are also more
     error-prone and more difficult to guess

c) sorted wordlists
   - the wordlist is sorted, which allows for more efficient lookup of the code words
     (i.e. an implementation can use binary search instead of linear search)
   - this also allows a trie (prefix tree) to be used, e.g. for better compression
The wordlist can contain native characters, but they have to be encoded in UTF-8
using Normalization Form Compatibility Decomposition (NFKD).
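
Normalization matters because the same visible character can be stored as different code point sequences. A minimal Python sketch using the standard <code>unicodedata</code> module (the helper name <code>to_nfkd_utf8</code> is an assumption of this example):

```python
import unicodedata


def to_nfkd_utf8(text):
    """Normalize text to NFKD and encode it as UTF-8 bytes."""
    return unicodedata.normalize("NFKD", text).encode("utf-8")


composed = "\u00e9"      # "e with acute" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Without normalization the two spellings encode to different bytes.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))
# After NFKD normalization they are byte-for-byte identical.
print(to_nfkd_utf8(composed) == to_nfkd_utf8(decomposed))
```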

==From mnemonic to seed==

A user may decide to protect their mnemonic with a passphrase. If a passphrase is not
present, an empty string "" is used instead.

To create a binary seed from the mnemonic, we use the PBKDF2 function with the mnemonic
sentence (in UTF-8 NFKD) as the password and the string "mnemonic" + passphrase (again
in UTF-8 NFKD) as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as
the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).
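
With Python's standard library, the derivation described above might look like the following sketch. The function name <code>mnemonic_to_seed</code> is chosen for this example; <code>hashlib.pbkdf2_hmac</code> supplies PBKDF2 with HMAC-SHA512.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic, passphrase=""):
    """Derive the 64-byte binary seed from a mnemonic sentence."""
    # Password: the mnemonic sentence in UTF-8 NFKD.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    # Salt: the string "mnemonic" + passphrase, again in UTF-8 NFKD.
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 iterations of HMAC-SHA512, 512-bit (64-byte) derived key.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


sentence = " ".join(["abandon"] * 11 + ["about"])
seed_plain = mnemonic_to_seed(sentence)
seed_protected = mnemonic_to_seed(sentence, "TREZOR")
print(seed_plain.hex())
print(seed_protected.hex())
```

Both calls return a 64-byte seed, but the two seeds differ because the passphrase is part of the salt.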

This seed can later be used to generate deterministic wallets using BIP-0032 or
similar methods.

The conversion of the mnemonic sentence to binary seed is completely independent
from generating the sentence. This results in rather simple code; there are no
constraints on sentence structure and clients are free to implement their own
wordlists or even whole sentence generators, allowing for flexibility in wordlists
for typo detection or other purposes.

Although it is possible to use a mnemonic that was not generated by the algorithm
described in the "Generating the mnemonic" section, this is not advised. Software
must compute the checksum of the mnemonic sentence using the wordlist and issue a
warning if it is invalid.
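
A checksum check of this kind could be sketched as below. The function name <code>checksum_is_valid</code> is hypothetical, and <code>placeholder_wordlist</code> is a made-up list of 2048 dummy words standing in for a real wordlist such as the English one; only the word positions matter for the checksum.

```python
import hashlib


def checksum_is_valid(mnemonic, wordlist):
    """Return True if the mnemonic's trailing checksum bits match its entropy."""
    words = mnemonic.split()
    # ENT + CS = 33k bits = 3k words, so the word count must be a multiple of 3.
    if not words or len(words) % 3 != 0:
        return False
    index_of = {word: i for i, word in enumerate(wordlist)}
    if any(word not in index_of for word in words):
        return False

    # Rebuild the concatenated ENT + CS bit string from the 11-bit indices.
    combined = 0
    for word in words:
        combined = (combined << 11) | index_of[word]

    total_bits = 11 * len(words)
    cs_bits = total_bits // 33
    ent_bits = total_bits - cs_bits

    entropy = (combined >> cs_bits).to_bytes(ent_bits // 8, "big")
    actual = combined & ((1 << cs_bits) - 1)
    digest = hashlib.sha256(entropy).digest()
    expected = int.from_bytes(digest, "big") >> (256 - cs_bits)
    return actual == expected


# Hypothetical stand-in for a real 2048-word list.
placeholder_wordlist = ["w%04d" % i for i in range(2048)]

good = " ".join(["w0000"] * 11 + ["w0003"])
bad = " ".join(["w0000"] * 12)
for sentence in (good, bad):
    if not checksum_is_valid(sentence, placeholder_wordlist):
        print("warning: mnemonic checksum is invalid")
```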

The described method also provides plausible deniability, because every passphrase
generates a valid seed (and thus a deterministic wallet), but only the correct one
will make the desired wallet available.

==Wordlists==

* [[bip-0039/english.txt|English]]

==Test vectors==

See https://github.com/trezor/python-mnemonic/blob/master/vectors.json

==Reference Implementation==

Reference implementation including wordlists is available from

http://github.com/trezor/python-mnemonic

==Other Implementations==

Objective-C - https://github.com/nybex/NYMnemonic