Merge branch 'minh_changes'

pull/96/head
Minh T. Nguyen 10 years ago
commit 90799be2e3

@ -5,7 +5,7 @@
=== Peer-to-Peer Network Architecture
Bitcoin is structured as a peer-to-peer network architecture on top of the Internet. The term peer-to-peer or P2P means that the computers that participate in the network are peers to each other, they are all equal, there are no "special" nodes and all nodes share the burden of providing network services. The network nodes interconnect in a mesh network with a "flat" topology. There is no "server", no centralized service, and no hierarchy within the network. Nodes in a peer-to-peer network both provide and consume services at the same time, with reciprocity acting as the incentive for participation. Peer-to-peer networks are inherently resilient, de-centralized, and open. The pre-eminent example of a P2P network architecture was the early Internet itself, where nodes on the IP network were equal. Today's Internet architecture is more hierarchical, but the Internet Protocol still retains its flat-topology essence. Beyond bitcoin, the largest and most successful application of P2P technologies is file sharing, with Napster as the pioneer and bittorrent as the most recent evolution of the architecture.
Bitcoin is structured as a peer-to-peer network architecture on top of the Internet. The term peer-to-peer or P2P means that the computers that participate in the network are peers to each other, that they are all equal, that there are no "special" nodes and that all nodes share the burden of providing network services. The network nodes interconnect in a mesh network with a "flat" topology. There is no "server", no centralized service, and no hierarchy within the network. Nodes in a peer-to-peer network both provide and consume services at the same time with reciprocity acting as the incentive for participation. Peer-to-peer networks are inherently resilient, de-centralized, and open. The pre-eminent example of a P2P network architecture was the early Internet itself, where nodes on the IP network were equal. Today's Internet architecture is more hierarchical, but the Internet Protocol still retains its flat-topology essence. Beyond bitcoin, the largest and most successful application of P2P technologies is file sharing with Napster as the pioneer and bittorrent as the most recent evolution of the architecture.
Bitcoin's P2P network architecture is much more than a topology choice. Bitcoin is a peer-to-peer digital cash system by design, and the network architecture is both a reflection and a foundation of that core characteristic. De-centralization of control is a core design principle and that can only be achieved and maintained by a flat, de-centralized P2P consensus network.
@ -13,13 +13,13 @@ The term "bitcoin network" refers to the collection of nodes running the bitcoin
=== Nodes Types and Roles
While nodes in the bitcoin P2P network are equal, they may take on different "roles", depending on the functionality they are supporting. A bitcoin node is a collection of functions: routing, the blockchain database, mining, and wallet services. A full node with all four of these functions is shown below:
While nodes in the bitcoin P2P network are equal, they may take on different "roles" depending on the functionality they are supporting. A bitcoin node is a collection of functions: routing, the blockchain database, mining, and wallet services. A full node with all four of these functions is shown below:
[[full_node_reference]]
.A bitcoin network node with all four functions: Network routing, Blockchain database, Mining and Wallet
.A bitcoin network node with all four functions: Wallet, Miner, full Blockchain database, and Network routing
image::images/FullNodeReferenceClient_Small.png["FullNodeReferenceClient_Small"]
All nodes include the routing function to participate in the network and may include other functionality. All nodes validate and propagate transactions and blocks, discover and maintain connections to peers. In the full node example above, the routing function is indicated by an orange circle named "Network Routing Node".
All nodes include the routing function to participate in the network and may include other functionality. All nodes validate and propagate transactions and blocks, and discover and maintain connections to peers. In the full node example above, the routing function is indicated by an orange circle named "Network Routing Node".
Some nodes, called full nodes, also maintain a complete and up-to-date copy of the blockchain. Full nodes can autonomously and authoritatively verify any transaction without external reference. Some nodes maintain only a subset of the blockchain and verify transactions using a method called _Simple Payment Verification_ or SPV. These nodes are known as SPV or Lightweight nodes. In the full node example above, the full node blockchain database function is indicated by a blue circle named "Full Blockchain". SPV nodes are drawn without the blue circle, showing that they do not have a full copy of the blockchain.
@ -53,12 +53,12 @@ When a new node boots up, it must discover other bitcoin nodes on the network in
To connect to a known peer, nodes establish a TCP connection, usually to port 8333 (the bitcoin "well known" port), or an alternative port if one is provided. Upon establishing a connection, the node will start a "handshake" by transmitting a +version+ message, which contains basic identifying information, including:
* PROTOCOL_VERSION, a constant that defines the bitcoin P2P protocol version the client "speaks". E.g. 70002
* PROTOCOL_VERSION, a constant that defines the bitcoin P2P protocol version the client "speaks" (e.g. 70002)
* nLocalServices, a list of local services supported by the node, currently just NODE_NETWORK
* nTime, the current time
* addrYou, the IP address of the remote node as seen from this node
* addrMe, the IP address of the local node, as discovered by the local node
* subver, a sub-version showing the type of software running on this node, e.g. "/Satoshi:0.9.2.1/"
* subver, a sub-version showing the type of software running on this node (e.g. "/Satoshi:0.9.2.1/“)
* BestHeight, the block height of this node's blockchain
(See https://github.com/bitcoin/bitcoin/blob/d3cb2b8acfce36d359262b4afd7e7235eff106b0/src/net.cpp#L562 for an example of the +version+ network message)
@ -125,9 +125,9 @@ If there is no traffic on a connection, nodes will periodically send a message t
=== Full Nodes
Full nodes are nodes that maintain a full blockchain, with all transactions. More accurately they probably should be called "full blockchain nodes". In the early years of bitcoin, all nodes were full nodes and currently the Bitcoin Core client is a full blockchain node. In the last two years, however, new forms of bitcoin clients have been introduced that do not maintain a full blockchain but run as lightweight clients. These are examined in more detail in the next section.
Full nodes are nodes that maintain a full blockchain with all transactions. More accurately they probably should be called "full blockchain nodes". In the early years of bitcoin, all nodes were full nodes and currently the Bitcoin Core client is a full blockchain node. In the last two years, however, new forms of bitcoin clients have been introduced that do not maintain a full blockchain but run as lightweight clients. These are examined in more detail in the next section.
Full blockchain nodes maintain a complete and up-to-date copy of the bitcoin blockchain, with all the transactions, which they independently build and verify, starting with the very first block (genesis block) and building up to the latest known block in the network. A full blockchain node can independently and authoritatively verify any transaction, without recourse or reliance on any other node or source of information. The full blockchain node relies on the network to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
Full blockchain nodes maintain a complete and up-to-date copy of the bitcoin blockchain with all the transactions, which they independently build and verify, starting with the very first block (genesis block) and building up to the latest known block in the network. A full blockchain node can independently and authoritatively verify any transaction without recourse or reliance on any other node or source of information. The full blockchain node relies on the network to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
Running a full blockchain node gives you the pure bitcoin experience: independent verification of all transactions without the need to rely on, or trust, any other systems. It's easy to tell if you're running a full node because it requires several gigabytes of persistent storage (disk space) to store the full blockchain. If you need a lot of disk and it takes 2-3 days to "sync" to the network, you are running a full node. That is the price of complete independence and freedom from central authority.
@ -135,13 +135,13 @@ There are a few alternative implementations of full-blockchain bitcoin clients,
=== Exchanging "Inventory"
The first thing a full node will do once it connects to peers is try to construct a complete blockchain. If it is a brand-new node and has no blockchain at all, then it only knows one block (the genesis block), which is statically embedded in the client software. Starting with block #0, the genesis block, the new node will have to download hundreds of thousands of blocks to synchronize with the network and establish a full blockchain.
The first thing a full node will do once it connects to peers is try to construct a complete blockchain. If it is a brand-new node and has no blockchain at all, then it only knows one block (the genesis block), which is statically embedded in the client software. Starting with block #0, the genesis block, the new node will have to download hundreds of thousands of blocks to synchronize with the network and re-establish the full blockchain.
The process of "syncing" the blockchain starts with the +version+ message, as that contains +BestHeight+, a node's current blockchain height (number of blocks). A node will see the +version+ messages from its peers, know how many blocks they each have and be able to compare to how many blocks it has in its own blockchain. Peered nodes will exchange a +getblocks+ message that contains the hash (fingerprint) of the top block on their local blockchain. One of the peers will be able to identify the received hash as belonging to a block that is not at the top, but rather belongs to an older block, thus deducing that its own local blockchain is longer than its peer's.
The peer that has the longer blockchain has more blocks than the other node and can identify which blocks the other node needs in order to "catch up". It will identify the first 500 blocks to share and transmit their hashes using an +inv+ (inventory) message. The node missing these blocks will then retrieve them, by issuing a series of +getdata+ messages requesting the full block data and identifying the requested blocks using the hashes from the +inv+ message.
Let's assume for example that a node only has the genesis block. It will then receive an +inv+ message from its peers containing the hashes of the next 500 blocks in the chain. It will start requesting blocks from all of its connected peers, spreading the load and ensuring that it doesn't overwhelm any peer with requests. The node keeps track of how many blocks are "in transit" per peer connection, meaning blocks that it has requested but not received, checking that it does not exceed a limit (MAX_BLOCKS_IN_TRANSIT_PER_PEER). This way, if it needs a lot of blocks, it will only request new ones as previous requests are fulfilled, allowing the peers to control the pace of updates and not overwhelming the network. As each block is received, it is added to the blockchain as we will see in the next chapter <<blockchain>>. The local blockchain is gradually built up, more blocks are requested and received and the process continues until the node catches up to the rest of the network.
Let's assume for example that a node only has the genesis block. It will then receive an +inv+ message from its peers containing the hashes of the next 500 blocks in the chain. It will start requesting blocks from all of its connected peers, spreading the load and ensuring that it doesn't overwhelm any peer with requests. The node keeps track of how many blocks are "in transit" per peer connection, meaning blocks that it has requested but not received, checking that it does not exceed a limit (MAX_BLOCKS_IN_TRANSIT_PER_PEER). This way, if it needs a lot of blocks, it will only request new ones as previous requests are fulfilled, allowing the peers to control the pace of updates and not overwhelming the network. As each block is received, it is added to the blockchain as we will see in the next chapter <<blockchain>>. As the local blockchain is gradually built up, more blocks are requested and received and the process continues until the node catches up to the rest of the network.
This process of comparing the local blockchain with the peers and retrieving any missing blocks happens any time a node goes offline for any period of time. Whether a node has been offline for a few minutes and is missing a few blocks, or a month and is missing a few thousand blocks, it starts by sending +getblocks+, gets an +inv+ response, and starts downloading the missing blocks.
@ -184,7 +184,7 @@ Shortly after the introduction of SPV/lightweight nodes, the bitcoin developers
A bloom filter is a probabilistic search filter, a way to describe a desired pattern without specifying it exactly. Bloom filters offer an efficient way to express a search pattern while protecting privacy. They are used by SPV nodes to ask their peers for transactions matching a specific pattern, without revealing exactly which addresses they are searching for.
Using our previous analogy of a tourist without a map asking for directions to a specific address "23 Church St". If they ask strangers for directions to this street, they inadvertently reveal their destination. A bloom filter is like asking "Are there any streets in this neighborhood whose name ends in R-C-H". A question like that reveals slightly less about the desired destination, than asking for "23 Church St". Using this technique, a tourist could specify the desired address in more detail as "ending in U-R-C-H" or less detail as "ending in H". By varying the precision of the search, the tourist reveals more or less information, at the expense of getting more or less specific results. If they ask a less specific pattern, they get a lot more possible addresses and better privacy but many of the results are irrelevant. If they ask for a very specific pattern then they get fewer results but they lose privacy.
In our previous analogy, a tourist without a map is asking for directions to a specific address "23 Church St". If they asks strangers for directions to this street, they inadvertently reveal their destination. A bloom filter is like asking "Are there any streets in this neighborhood whose name ends in R-C-H". A question like that reveals slightly less about the desired destination, than asking for "23 Church St". Using this technique, a tourist could specify the desired address in more detail as "ending in U-R-C-H" or less detail as "ending in H". By varying the precision of the search, the tourist reveals more or less information, at the expense of getting more or less specific results. If they ask a less specific pattern, they get a lot more possible addresses and better privacy but many of the results are irrelevant. If they ask for a very specific pattern then they get fewer results but they lose privacy.
Bloom filters serve this function by allowing an SPV node to specify a search pattern for transactions that can be tuned towards precision or privacy. A more specific bloom filter will produce accurate results, but at the expense of revealing what addresses are used in the user's wallet. A less specific bloom filter will produce more data about more transactions, many irrelevant to the node, but will allow the node to maintain better privacy.
@ -260,7 +260,7 @@ Whereas the transaction and orphan pools represent a single node's local perspec
=== Alert Messages
Alert messages are a seldom used function, which is nevertheless implemented in most nodes. Alert messages are bitcoin's "emergency broadcast system", a means by which the core bitcoin developers can send an emergency text message to all bitcoin nodes. This feature is implemented to allow the core developer team to notify all bitcoin users of a serious problem in the bitcoin network, such as a critical bug that requires user action. The alert system has only been used a handful of times, most notably in the Spring of 2013 when a critical database bug caused a multi-block fork to occur in the bitcoin blockchain.
Alert messages are a seldom used function, which is nevertheless implemented in most nodes. Alert messages are bitcoin's "emergency broadcast system", a means by which the core bitcoin developers can send an emergency text message to all bitcoin nodes. This feature is implemented to allow the core developer team to notify all bitcoin users of a serious problem in the bitcoin network, such as a critical bug that requires user action. The alert system has only been used a handful of times, most notably early 2013 when a critical database bug caused a multi-block fork to occur in the bitcoin blockchain.
Alert messages are propagated by the +alert+ message. The alert message contains several fields, including:

@ -12,7 +12,7 @@ While a block has just one parent, it can temporarily have multiple children. Ea
The "previous block hash" field is inside the block header and thereby affects the _current_ block's hash. The child's own identity changes if the parent's identity changes. When the parent is modified in any way, the parent's hash changes. The parent's changed hash necessitates a change in the "previous block hash" pointer of the child. This in turn causes the child's hash to change, which requires a change in the pointer of the grandchild, which in turn changes the grandchild and so on. This cascade effect ensures that once a block has many generations following it, it cannot be changed without forcing a recalculation of all subsequent blocks. Because such a recalculation would require enormous computation, the existence of a long chain of blocks makes the blockchain's deep history immutable, a key feature of bitcoin's security.
One way to think about the blockchain is like layers in a geological formation, or glacier core sample. The surface layers may change with the seasons, or even be blown away before they have time to settle. But once you go a few inches deep, geological layers become more and more stable. By the time you look a few hundred feet down, you are looking at a snapshot of the past that has remained undisturbed for millennia or millions of years. In the blockchain, the most recent few blocks may be revised if there is a chain recalculation due to a fork. The top six blocks are like a few inches of topsoil. But once you go deeper into the blockchain, beyond 6 blocks, blocks are less and less likely to change. After 100 blocks back there is so much stability that the "coinbase" transaction, containing new bitcoin, can be spent. A few thousand blocks back (a month) and the blockchain is settled history. It will never change.
One way to think about the blockchain is like layers in a geological formation, or glacier core sample. The surface layers may change with the seasons, or even be blown away before they have time to settle. But once you go a few inches deep, geological layers become more and more stable. By the time you look a few hundred feet down, you are looking at a snapshot of the past that has remained undisturbed for millennia or millions of years. In the blockchain, the most recent few blocks may be revised if there is a chain recalculation due to a fork. The top six blocks are like a few inches of topsoil. But once you go deeper into the blockchain, beyond 6 blocks, blocks are less and less likely to change. After 100 blocks back there is so much stability that the "coinbase" transaction, the transaction containing newly-mined bitcoins, can be spent. A few thousand blocks back (a month) and the blockchain is settled history. It will never change.
=== Structure of a Block
@ -169,13 +169,13 @@ Since the merkle tree is a binary tree, it needs an even number of leaf nodes. I
.An even number of data elements, by duplicating one data element
image::images/MerkleTreeOdd.png["merkle_tree_odd"]
The same method for constructing a tree from four transactions can be generalized to construct trees of any size. In bitcoin it is common to have several hundred to more than a thousand transactions in a single block, which are summarized in exactly the same way producing just 32-bytes of data from a single merkle root. In the diagram below, you will see a tree built from 16 transactions. Note that while the root looks bigger than the leaf nodes in the diagram, it is the exact same size, just 32 bytes. Whether there is one transaction or a hundred thousand transactions in the block, the merkle root always summarizes them into 32 bytes:
The same method for constructing a tree from four transactions can be generalized to construct trees of any size. In bitcoin it is common to have several hundred to more than a thousand transactions in a single block, which are summarized in exactly the same way producing just 32 bytes of data as the single merkle root. In the diagram below, you will see a tree built from 16 transactions. Note that while the root looks bigger than the leaf nodes in the diagram, it is the exact same size, just 32 bytes. Whether there is one transaction or a hundred thousand transactions in the block, the merkle root always summarizes them into 32 bytes:
[[merkle_tree_large]]
.A Merkle Tree summarizing many data elements
image::images/MerkleTreeLarge.png["merkle_tree_large"]
To prove that a specific transaction is included in a block, a node need only produce +log~2~(N)+ 32-byte hashes, constituting an _authentication path_ or _merkle path_ connecting the specific transaction to the root of the tree. This is especially important as the number of transactions increases, because the base-2 logarithm of the number of transactions increases much more slowly. This allows bitcoin nodes to efficiently produce paths of ten or twelve hashes (320-384 bytes) which can provide proof of a single transaction out of more than a thousand transactions in a megabyte sized block. In the example below, a node can prove that a transaction K is included in the block by producing a merkle path that is only four 32-byte hashes long (128 bytes total). The path consists of the four hashes H~L~, H~IJ~, H~MNOP~ and H~ABCDEFGH~. With those four hashes provided as an authentication path, any node can prove that H~K~ is included in the merkle root by computing four additional pair-wise hashes H~KL~, H~IJKL~ and H~IJKLMNOP~ that lead to the merkle root.
To prove that a specific transaction is included in a block, a node only needs to produce +log~2~(N)+ 32-byte hashes, constituting an _authentication path_ or _merkle path_ connecting the specific transaction to the root of the tree. This is especially important as the number of transactions increases, because the base-2 logarithm of the number of transactions increases much more slowly. This allows bitcoin nodes to efficiently produce paths of ten or twelve hashes (320-384 bytes) which can provide proof of a single transaction out of more than a thousand transactions in a megabyte sized block. In the example below, a node can prove that a transaction K is included in the block by producing a merkle path that is only four 32-byte hashes long (128 bytes total). The path consists of the four hashes H~L~, H~IJ~, H~MNOP~ and H~ABCDEFGH~. With those four hashes provided as an authentication path, any node can prove that H~K~ is included in the merkle root by computing four additional pair-wise hashes H~KL~, H~IJKL~ and H~IJKLMNOP~ that lead to the merkle root.
[[merkle_tree_path]]
.A Merkle Path used to prove inclusion of a data element

@ -16,7 +16,7 @@ In this chapter, we will first examine mining as a monetary supply mechanism and
==== Bitcoin Economics and Currency Creation
Bitcoins are "minted" during the creation of each block at a fixed and diminishing rate. Each block, generated on average every 10 minutes, contains an entirely new bitcoins, created ex nihilo (from nothing). Every 210,000 blocks or approximately every four years the currency issuance rate is decreased by 50%. For the first four years of operation of the network, each block contained 50 new bitcoin. In November of 2012, the new bitcoin issuance rate was decreased to 25 bitcoin per block and it will decrease again to 12.5 bitcoin at block 420,000, which will be mined sometime in 2016. The rate of new coins decreases like this exponentially over 64 "halvings", until block 13,230,000 (mined in year 2137, approximately) when it reaches the minimum currency unit of 1 satoshi. Finally, after 13.44 million blocks, in approximately 2140, all 2,099,999,997,690,000 satoshis, or almost 21 million bitcoin will be issued. Thereafter, blocks will contain no new bitcoin, and miners will be rewarded solely through the transaction fees.
Bitcoins are "minted" during the creation of each block at a fixed and diminishing rate. Each block, generated on average every 10 minutes, contains entirely new bitcoins, created ex nihilo (from nothing). Every 210,000 blocks or approximately every four years the currency issuance rate is decreased by 50%. For the first four years of operation of the network, each block contained 50 new bitcoin. In November of 2012, the new bitcoin issuance rate was decreased to 25 bitcoin per block and it will decrease again to 12.5 bitcoin at block 420,000, which will be mined sometime in 2016. The rate of new coins decreases like this exponentially over 64 "halvings", until block 13,230,000 (mined approximately in year 2137) when it reaches the minimum currency unit of 1 satoshi. Finally, after 13.44 million blocks, in approximately 2140, all 2,099,999,997,690,000 satoshis, or almost 21 million bitcoin will be issued. Thereafter, blocks will contain no new bitcoin, and miners will be rewarded solely through the transaction fees.
[[bitcoin_money_supply]]
.Supply of bitcoin currency over time based on a geometrically decreasing issuance rate
@ -126,7 +126,7 @@ If there is any space remaining in the block, Jing's mining node may choose to f
Any transactions left in the memory pool, after the block is filled, will remain in the pool for inclusion in the next block. As transactions remain in the memory pool, their inputs "age", as the UTXO they spend get deeper into the blockchain with new blocks added on top. Since a transaction's priority depends on the age of its inputs, transactions remaining in the pool will age and therefore increase in priority. Eventually a transaction without fees may reach a high enough priority to be included in the block for free.
Bitcoin transactions do not have an expiration time-out. A transaction that is valid now will be valid in perpetuity. However, if a transaction is only propagated across the network once it will persist only as long as it is held in a mining node memory pool. When a mining node is restarted, its memory pool is wiped clear, as it is a transient non-persistent form of storage. While a valid transaction may have been propagated across the network, if it is not executed it may eventually not reside in the memory pool of any miner. Wallet software is expected to retransmit such transactions or reconstruct them with higher fees if they are not successfully executed within a reasonable amount of time.
Bitcoin transactions do not have an expiration time-out. A transaction that is valid now will be valid in perpetuity. However, if a transaction is only propagated across the network once it will persist only as long as it is held in a mining node memory pool. When a mining node is restarted, its memory pool is wiped clear, as it is a transient non-persistent form of storage. While a valid transaction may have been propagated across the network, if it is not executed it may eventually not reside in the memory pool of any miner. Wallet software are expected to retransmit such transactions or reconstruct them with higher fees if they are not successfully executed within a reasonable amount of time.
When Jing's node aggregates all the transactions from the memory pool, the new candidate block has 418 transactions with total transaction fees of 0.09094928 bitcoin. You can see this block in the blockchain using the Bitcoin Core client command line interface:
====
@ -654,7 +654,7 @@ In the next section we will look at how discrepancies between competing chains (
Because the blockchain is a decentralized data structure, different copies of it are not always consistent. Blocks may arrive at different nodes at different times, causing the nodes to have different perspectives of the blockchain. To resolve this, each node always selects and attempts to extend the chain of blocks that represents the most Proof-of-Work, also known as the longest chain or greatest cumulative difficulty chain. By summing the difficulty recorded in each block in a chain, a node can calculate the total amount of Proof-of-Work that has been expended to create that chain. As long as all nodes select the longest cumulative difficulty chain, the global bitcoin network eventually converges to a consistent state. Forks occur as temporary inconsistencies between versions of the blockchain, which are resolved by eventual re-convergence as more blocks are added to one of the forks.
In the next few diagrams, we follow the progress of a "fork" event across the network. The diagram is a simplified representation of bitcoin as a global network. In reality, the bitcoin network's topology is not organized geographically. Rather, it forms a mesh network of interconnected nodes, which may be located very far from each other geographically. The representation of a geographic topology is a simplification used for the purposes of illustrating a fork. In the real bitcoin network, the "distance" between nodes is measured in "hops" from node to node, not in terms of their location. For illustration purposes, different blocks are shown as different colors, spreading across the network and coloring the connections they traverse.
In the next few diagrams, we follow the progress of a "fork" event across the network. The diagram is a simplified representation of bitcoin as a global network. In reality, the bitcoin network's topology is not organized geographically. Rather, it forms a mesh network of interconnected nodes, which may be located very far from each other geographically. The representation of a geographic topology is a simplification used for the purposes of illustrating a fork. In the real bitcoin network, the "distance" between nodes is measured in "hops" from node to node, not in terms of their physical location. For illustration purposes, different blocks are shown as different colors, spreading across the network and coloring the connections they traverse.
In the first diagram below, the network has a unified perspective of the blockchain, with the blue block as the tip of the main chain.

Loading…
Cancel
Save