1
0
mirror of https://github.com/bitcoinbook/bitcoinbook synced 2024-12-23 15:18:11 +00:00
bitcoinbook/appb_errata.adoc

244 lines
11 KiB
Plaintext
Raw Normal View History

[appendix]
== Errata to the Bitcoin Whitepaper
A description of known problems in Satoshi Nakamotos paper, "Bitcoin:
A Peer-to-Peer Electronic Cash System", as well as notes on terminology
changes and how Bitcoin's implementation differs from that described in
the paper.
This document was originally published by a co-author of this book in
2016; it is reproduced here with updates. The names of
sections in this errata correspond to the names of the
sections in Nakamoto's original paper.
=== Abstract
____
"The longest chain not only serves as proof of the sequence of events
witnessed, but proof that it came from the largest pool of CPU power."
____
* *Implementation detail:* If each link in the chain (called "blocks"
CH12-14: edits for Murchandamus feedback - Describe topological order to help readers understand how that solves the double spend problem - Mention that transactions can be safely relayed one block before their locktime allows them to be included in a block because they'll be valid next block. - Be a bit clearer about when subsidy becomes zero due to rounding and when BIP42 makes it zero unconditionally. - Describe the creation of the witness merkle root before the block header merkle root - Move up note about the retarget off-by-one bug - Make "best blockchain" an inherent property and not an alias for a current chain. When a new block arrives that triggers a reorg, we don't switch best blockchains---the new chain is the best blockchain and we switch to using it. - Combine two sections about forks that were repetitive - Mention that pool miners also need to prove they paid the pool's preferred coinbase transaction template - Add a todo to clarify terminology around the 51% attack. The existing text used this in a way that's consistent with how it was used in early Bitcoin history, but it's potentially confusing because it doesn't actually require a majority of hashrate to make the attack and it confuses it with a censorship attack that does require a majority (or at least a selfishing mining minority) to maintain. - Reduce the situations we describe as "double spends". Consensus prevents double spending within a valid chain; the other situations are about unconfirmed transactions, which might better be described using other terms that won't confuse readers into thinking Bitcoin's double spend protection doesn't work. - Add warning about backing up more than your seed when multisig or complex contracts are in use. - Add a todo to fix "millibits" situation, which might involve image changes. - Note that the first commitment transaction (the refund) needs to be signed before the funding transaction in LN channels. - Pluralize "bitcoin" as necessary (somehow missed this before). - Drop mention of tumblebit and teechan, which nobody is working on AFAIK.
2023-08-05 01:16:06 +00:00
in Bitcoin) was built using the same amount of Proof-of-Work (POW), the
longest chain would be the one backed by the largest pool of
computational power. However, Bitcoin was implemented in such a way that
the amount of POW can vary between blocks, so it became important not to
check for the "the longest chain" but rather "the chain demonstrating
the most POW"; this is often shortened to "most-work chain".
+
The
https://github.com/bitcoin/bitcoin/commit/40cd0369419323f8d7385950e20342e998c994e1#diff-623e3fd6da1a45222eeec71496747b31R420[change]
from checking for the longest chain to checking for the most-work chain
occurred in July 2010, long after Bitcoins initial release:
+
[source,diff]
----
- if (pindexNew->nHeight > nBestHeight)
+ if (pindexNew->bnChainWork > bnBestChainWork)
----
* *Terminology change:* General CPUs were used to generate the POW for
the earliest Bitcoin blocks but POW generation today is mostly performed
by specialist Application Specific Integrated Circuits (ASICs), so
instead of saying "CPU power" it is perhaps more correct to say
"computational power" or, simply, "hash rate" for the hashing used
in generating the POW.
____
"As long as a majority of CPU power is controlled by nodes that are not
cooperating to attack the network, theyll generate the longest chain
and outpace attackers."
____
* *Terminology change:* The term "nodes" today is used to refer to
full validation nodes, which are programs that enforce all the rules of
the system. Programs (and hardware) that extend the chain today are
called "miners" based on Nakamotos analogy to gold miners in section
6 of the paper. Nakamoto expected all miners to be nodes but the
software he released did not require all nodes to be miners. In the the
original software, a simple menu item in the node GUI allowed toggling
the mining function or or off.
+
Today it is the case that the overwhelming number of nodes are not
miners and that many individuals who own mining hardware do not use it
with their own nodes (and even those that do mine with their own nodes
often mine for short periods of time on top of newly discovered blocks
without ensuring their node considers the new block valid). The early
parts of the paper where "nodes" is mostly used without modification
refer to mining using a full validation node; the later parts of the
paper which refer to "network nodes" is mainly about what nodes can do
even if they arent mining.
* *Post-publication discovery:* When a new block is produced, the miner
who produces that block can begin working on its sequel immediately but
all other miners are unaware of the new block and cannot begin working
on it until it has propagated across the
network to them. This gives miners who produce many blocks an edge over
miners who produce fewer blocks, and this can be exploited in whats
known as the _selfish mining attack_ to allow an attacker with around
30% of total network hash rate to make other miners less profitable,
perhaps driving them into following the attacking miners policy. So
instead of saying "a majority of CPU power is controlled by nodes that
are not cooperating to attack the network" it is perhaps more correct
to say "as long as nodes cooperating to attack the network control less
than about 30% of the network".
2023-08-10 02:47:44 +00:00
=== Transactions
____
"We define an electronic coin as a chain of digital signatures. Each
owner transfers the coin to the next by digitally signing a hash of the
previous transaction and the public key of the next owner and adding
these to the end of the coin."
____
* *Implementation detail:* Bitcoin implements a more general version of
this system where digital signatures are not used directly but rather a
"deterministic expression" is used instead. Just as a signature that
matches a known public key can be used to enable a payment, the data
that satisfies an known expression can also enable a payment.
Generically, the expression that must be satisfied in Bitcoin in order
to spend a coin is known as an "encumbrance". Almost all encumbrances
in Bitcoin to date require providing at least one signature. So instead
of saying "a chain of digital signatures" it is more correct to say
"a chain of encumbrances". Given that transactions often have more
than one input and more than one output, the structure is not very
chain-like; its more accurately described as a directed acyclic graph
(DAG).
2023-08-10 02:47:44 +00:00
=== Proof-of-Work
____
"we implement the proof-of-work by incrementing a nonce in the block
until a value is found that gives the blocks hash the required zero
bits."
____
* *Implementation detail:* Adam Backs Hashcash implementation requires
finding a hash with the required number of leading zero bits. Bitcoin
treats the hash as an integer and requires that it be less than a
specified integer, which effectively allows a fractional number of bits
to be specified.
____
"Proof-of-work is essentially one-CPU-one-vote."
____
* *Important note:* the vote here is not on the rules of the system but
merely on the ordering of the transactions in order to provide
assurances that an "electronic coin" cannot be easily double spent.
This is described in more detail in section 11 of the paper where it
says, "We consider the scenario of an attacker trying to generate an
alternate chain faster than the honest chain. Even if this is
accomplished, it does not throw the system open to arbitrary changes,
such as creating value out of thin air or taking money that never
belonged to the attacker. Nodes are not going to accept an invalid
transaction as payment, and honest nodes will never accept a block
containing them."
____
"proof-of-work difficulty is determined by a moving average targeting an
average number of blocks per hour."
____
* *Implementation detail:* A moving average is not used. Instead, every
2,016th block has its reported generation time compared to the
generation time for an earlier block, and the difference between them is
used to calculate the average used for adjustment.
+
Further, the average implemented in Bitcoin targets an average number of
blocks per two weeks (not per hour as might be implied by the text).
Other implemented rules may further slow adjustments, such as a rule
that the adjustment can not increase block production speed by more than
300% per period, nor slow it by more than 75%.
2023-08-10 02:47:44 +00:00
=== Reclaiming Disk Space
____
"Once the latest transaction in a coin is buried under enough blocks, the
spent transactions before it can be discarded to save disk space"
____
* *Possible post-publication discovery:* Although the Merkle Tree
structure described in this section can prove a transaction was included
in a particular block, there is currently no way in Bitcoin to prove
that a transaction has not been spent except to process all subsequent
data in the blockchain. This means the method described here cannot be
universally used for reclaiming disk space among all nodes, as all new
nodes will need to process all transactions.
2023-08-10 02:47:44 +00:00
=== Simplified Payment Verification
____
"One strategy to protect against this would be to accept alerts from
network nodes when they detect an invalid block, prompting the users
software to download the full block and alerted transactions to confirm
the inconsistency."
____
* *Important Note:* although software has been produced that implements
some parts of this section and calls that Simplified Payment
Verification (SPV), none of these programs currently accepts alerts from
network nodes (full validation nodes) when invalid blocks have been
detected. This has placed bitcoins in so-called SPV wallets at risk in
the past.
2023-08-10 02:47:44 +00:00
=== Privacy
____
"Some linking is still unavoidable with multi-input transactions, which
necessarily reveal that their inputs were owned by the same owner"
____
* *Post-publication invention:* it isn't clear that different inputs
in the same transaction have the same owner if if owners often mix their
inputs with
inputs belonging to other owners. For example, theres no public
difference between Alice and Bob each contributing one of their inputs
towards paying Charlie and Dan than there is between just Alice
contributing two of her inputs towards paying Charlie and Dan.
+
This technique is known today as
https://en.bitcoin.it/wiki/CoinJoin[CoinJoin] and software implementing
it has been in use since 2015.
2023-08-10 02:47:44 +00:00
=== Calculations
____
"The receiver generates a new key pair and gives the public key to the
sender shortly before signing. This prevents the sender from preparing a
chain of blocks ahead of time by working on it continuously until he is
lucky enough to get far enough ahead, then executing the transaction at
that moment."
____
* *Post-publication discovery:* nothing about the receiver generating a
public key shortly before the spender signs a transaction prevents the
spender from preparing a chain of blocks ahead of time. Early Bitcoin
user Hal Finney discovered this attack and
https://bitcointalk.org/index.php?topic=3441.msg48384#msg48384[described
it]: "Suppose the attacker is generating blocks occasionally. in each
block he generates, he includes a transfer from address A to address B,
both of which he controls.
+
"To cheat you, when he generates a block, he doesnt broadcast it.
Instead, he runs down to your store and makes a payment to your address
C with his address A. You wait a few seconds, dont hear anything, and
transfer the goods. He broadcasts his block now, and his transaction
will take precedence over yours."
+
The attack works for any number of confirmations, and is sometimes named
the Finney Attack.
'''''
*Disclaimer:* the author of this document was not the first person to
identify any of the problems described here—he has merely collected them
into a single document.
*License:* this errata document is released under the
https://creativecommons.org/publicdomain/zero/1.0/[CC0] 1.0 Universal
Public Domain Dedication
For updates made after the publication of this book, please see the
https://gist.github.com/harding/dabea3d83c695e6b937bf090eddf2bb3[Original
document].