Copyediting of chapter 6, part 1

pull/96/head
Minh T. Nguyen 10 years ago
parent 39c3584de9
commit 70f6ae21e5

@ -5,7 +5,7 @@
=== Peer-to-Peer Network Architecture
Bitcoin is structured as a peer-to-peer network architecture on top of the Internet. The term peer-to-peer or P2P means that the computers that participate in the network are peers to each other, they are all equal, there are no "special" nodes and all nodes share the burden of providing network services. The network nodes interconnect in a mesh network with a "flat" topology. There is no "server", no centralized service, and no hierarchy within the network. Nodes in a peer-to-peer network both provide and consume services at the same time, with reciprocity acting as the incentive for participation. Peer-to-peer networks are inherently resilient, de-centralized, and open. The pre-eminent example of a P2P network architecture was the early Internet itself, where nodes on the IP network were equal. Today's Internet architecture is more hierarchical, but the Internet Protocol still retains its flat-topology essence. Beyond bitcoin, the largest and most successful application of P2P technologies is file sharing, with Napster as the pioneer and bittorrent as the most recent evolution of the architecture.
Bitcoin is structured as a peer-to-peer network architecture on top of the Internet. The term peer-to-peer or P2P means that the computers that participate in the network are peers to each other, that they are all equal, that there are no "special" nodes and that all nodes share the burden of providing network services. The network nodes interconnect in a mesh network with a "flat" topology. There is no "server", no centralized service, and no hierarchy within the network. Nodes in a peer-to-peer network both provide and consume services at the same time with reciprocity acting as the incentive for participation. Peer-to-peer networks are inherently resilient, de-centralized, and open. The pre-eminent example of a P2P network architecture was the early Internet itself, where nodes on the IP network were equal. Today's Internet architecture is more hierarchical, but the Internet Protocol still retains its flat-topology essence. Beyond bitcoin, the largest and most successful application of P2P technologies is file sharing with Napster as the pioneer and bittorrent as the most recent evolution of the architecture.
Bitcoin's P2P network architecture is much more than a topology choice. Bitcoin is a peer-to-peer digital cash system by design, and the network architecture is both a reflection and a foundation of that core characteristic. De-centralization of control is a core design principle and that can only be achieved and maintained by a flat, de-centralized P2P consensus network.
@ -13,13 +13,13 @@ The term "bitcoin network" refers to the collection of nodes running the bitcoin
=== Nodes Types and Roles
While nodes in the bitcoin P2P network are equal, they may take on different "roles", depending on the functionality they are supporting. A bitcoin node is a collection of functions: routing, the blockchain database, mining, and wallet services. A full node with all four of these functions is shown below:
While nodes in the bitcoin P2P network are equal, they may take on different "roles" depending on the functionality they are supporting. A bitcoin node is a collection of functions: routing, the blockchain database, mining, and wallet services. A full node with all four of these functions is shown below:
[[full_node_reference]]
.A bitcoin network node with all four functions: Network routing, Blockchain database, Mining and Wallet
.A bitcoin network node with all four functions: Wallet, Miner, full Blockchain database, and Network routing
image::images/FullNodeReferenceClient_Small.png["FullNodeReferenceClient_Small"]
All nodes include the routing function to participate in the network and may include other functionality. All nodes validate and propagate transactions and blocks, discover and maintain connections to peers. In the full node example above, the routing function is indicated by an orange circle named "Network Routing Node".
All nodes include the routing function to participate in the network and may include other functionality. All nodes validate and propagate transactions and blocks, and discover and maintain connections to peers. In the full node example above, the routing function is indicated by an orange circle named "Network Routing Node".
Some nodes, called full nodes, also maintain a complete and up-to-date copy of the blockchain. Full nodes can autonomously and authoritatively verify any transaction without external reference. Some nodes maintain only a subset of the blockchain and verify transactions using a method called _Simple Payment Verification_ or SPV. These nodes are known as SPV or Lightweight nodes. In the full node example above, the full node blockchain database function is indicated by a blue circle named "Full Blockchain". SPV nodes are drawn without the blue circle, showing that they do not have a full copy of the blockchain.
@ -53,12 +53,12 @@ When a new node boots up, it must discover other bitcoin nodes on the network in
To connect to a known peer, nodes establish a TCP connection, usually to port 8333 (the bitcoin "well known" port), or an alternative port if one is provided. Upon establishing a connection, the node will start a "handshake" by transmitting a +version+ message, which contains basic identifying information, including:
* PROTOCOL_VERSION, a constant that defines the bitcoin P2P protocol version the client "speaks". E.g. 70002
* PROTOCOL_VERSION, a constant that defines the bitcoin P2P protocol version the client "speaks" (e.g. 70002)
* nLocalServices, a list of local services supported by the node, currently just NODE_NETWORK
* nTime, the current time
* addrYou, the IP address of the remote node as seen from this node
* addrMe, the IP address of the local node, as discovered by the local node
* subver, a sub-version showing the type of software running on this node, e.g. "/Satoshi:0.9.2.1/"
* subver, a sub-version showing the type of software running on this node (e.g. "/Satoshi:0.9.2.1/“)
* BestHeight, the block height of this node's blockchain
(See https://github.com/bitcoin/bitcoin/blob/d3cb2b8acfce36d359262b4afd7e7235eff106b0/src/net.cpp#L562 for an example of the +version+ network message)
@ -125,9 +125,9 @@ If there is no traffic on a connection, nodes will periodically send a message t
=== Full Nodes
Full nodes are nodes that maintain a full blockchain, with all transactions. More accurately they probably should be called "full blockchain nodes". In the early years of bitcoin, all nodes were full nodes and currently the Bitcoin Core client is a full blockchain node. In the last two years, however, new forms of bitcoin clients have been introduced that do not maintain a full blockchain but run as lightweight clients. These are examined in more detail in the next section.
Full nodes are nodes that maintain a full blockchain with all transactions. More accurately they probably should be called "full blockchain nodes". In the early years of bitcoin, all nodes were full nodes and currently the Bitcoin Core client is a full blockchain node. In the last two years, however, new forms of bitcoin clients have been introduced that do not maintain a full blockchain but run as lightweight clients. These are examined in more detail in the next section.
Full blockchain nodes maintain a complete and up-to-date copy of the bitcoin blockchain, with all the transactions, which they independently build and verify, starting with the very first block (genesis block) and building up to the latest known block in the network. A full blockchain node can independently and authoritatively verify any transaction, without recourse or reliance on any other node or source of information. The full blockchain node relies on the network to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
Full blockchain nodes maintain a complete and up-to-date copy of the bitcoin blockchain with all the transactions, which they independently build and verify, starting with the very first block (genesis block) and building up to the latest known block in the network. A full blockchain node can independently and authoritatively verify any transaction without recourse or reliance on any other node or source of information. The full blockchain node relies on the network to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
Running a full blockchain node gives you the pure bitcoin experience: independent verification of all transactions without the need to rely on, or trust, any other systems. It's easy to tell if you're running a full node because it requires several gigabytes of persistent storage (disk space) to store the full blockchain. If you need a lot of disk and it takes 2-3 days to "sync" to the network, you are running a full node. That is the price of complete independence and freedom from central authority.
@ -135,13 +135,13 @@ There are a few alternative implementations of full-blockchain bitcoin clients,
=== Exchanging "Inventory"
The first thing a full node will do once it connects to peers is try to construct a complete blockchain. If it is a brand-new node and has no blockchain at all, then it only knows one block (the genesis block), which is statically embedded in the client software. Starting with block #0, the genesis block, the new node will have to download hundreds of thousands of blocks to synchronize with the network and establish a full blockchain.
The first thing a full node will do once it connects to peers is try to construct a complete blockchain. If it is a brand-new node and has no blockchain at all, then it only knows one block (the genesis block), which is statically embedded in the client software. Starting with block #0, the genesis block, the new node will have to download hundreds of thousands of blocks to synchronize with the network and re-establish the full blockchain.
The process of "syncing" the blockchain starts with the +version+ message, as that contains +BestHeight+, a node's current blockchain height (number of blocks). A node will see the +version+ messages from its peers, know how many blocks they each have and be able to compare to how many blocks it has in its own blockchain. Peered nodes will exchange a +getblocks+ message that contains the hash (fingerprint) of the top block on their local blockchain. One of the peers will be able to identify the received hash as belonging to a block that is not at the top, but rather belongs to an older block, thus deducing that its own local blockchain is longer than its peer's.
The peer that has the longer blockchain has more blocks than the other node and can identify which blocks the other node needs in order to "catch up". It will identify the first 500 blocks to share and transmit their hashes using an +inv+ (inventory) message. The node missing these blocks will then retrieve them, by issuing a series of +getdata+ messages requesting the full block data and identifying the requested blocks using the hashes from the +inv+ message.
Let's assume for example that a node only has the genesis block. It will then receive an +inv+ message from its peers containing the hashes of the next 500 blocks in the chain. It will start requesting blocks from all of its connected peers, spreading the load and ensuring that it doesn't overwhelm any peer with requests. The node keeps track of how many blocks are "in transit" per peer connection, meaning blocks that it has requested but not received, checking that it does not exceed a limit (MAX_BLOCKS_IN_TRANSIT_PER_PEER). This way, if it needs a lot of blocks, it will only request new ones as previous requests are fulfilled, allowing the peers to control the pace of updates and not overwhelming the network. As each block is received, it is added to the blockchain as we will see in the next chapter <<blockchain>>. The local blockchain is gradually built up, more blocks are requested and received and the process continues until the node catches up to the rest of the network.
Let's assume for example that a node only has the genesis block. It will then receive an +inv+ message from its peers containing the hashes of the next 500 blocks in the chain. It will start requesting blocks from all of its connected peers, spreading the load and ensuring that it doesn't overwhelm any peer with requests. The node keeps track of how many blocks are "in transit" per peer connection, meaning blocks that it has requested but not received, checking that it does not exceed a limit (MAX_BLOCKS_IN_TRANSIT_PER_PEER). This way, if it needs a lot of blocks, it will only request new ones as previous requests are fulfilled, allowing the peers to control the pace of updates and not overwhelming the network. As each block is received, it is added to the blockchain as we will see in the next chapter <<blockchain>>. As the local blockchain is gradually built up, more blocks are requested and received and the process continues until the node catches up to the rest of the network.
This process of comparing the local blockchain with the peers and retrieving any missing blocks happens any time a node goes offline for any period of time. Whether a node has been offline for a few minutes and is missing a few blocks, or a month and is missing a few thousand blocks, it starts by sending +getblocks+, gets an +inv+ response, and starts downloading the missing blocks.
@ -184,7 +184,7 @@ Shortly after the introduction of SPV/lightweight nodes, the bitcoin developers
A bloom filter is a probabilistic search filter, a way to describe a desired pattern without specifying it exactly. Bloom filters offer an efficient way to express a search pattern while protecting privacy. They are used by SPV nodes to ask their peers for transactions matching a specific pattern, without revealing exactly which addresses they are searching for.
Using our previous analogy of a tourist without a map asking for directions to a specific address "23 Church St". If they ask strangers for directions to this street, they inadvertently reveal their destination. A bloom filter is like asking "Are there any streets in this neighborhood whose name ends in R-C-H". A question like that reveals slightly less about the desired destination, than asking for "23 Church St". Using this technique, a tourist could specify the desired address in more detail as "ending in U-R-C-H" or less detail as "ending in H". By varying the precision of the search, the tourist reveals more or less information, at the expense of getting more or less specific results. If they ask a less specific pattern, they get a lot more possible addresses and better privacy but many of the results are irrelevant. If they ask for a very specific pattern then they get fewer results but they lose privacy.
In our previous analogy, a tourist without a map is asking for directions to a specific address "23 Church St". If they asks strangers for directions to this street, they inadvertently reveal their destination. A bloom filter is like asking "Are there any streets in this neighborhood whose name ends in R-C-H". A question like that reveals slightly less about the desired destination, than asking for "23 Church St". Using this technique, a tourist could specify the desired address in more detail as "ending in U-R-C-H" or less detail as "ending in H". By varying the precision of the search, the tourist reveals more or less information, at the expense of getting more or less specific results. If they ask a less specific pattern, they get a lot more possible addresses and better privacy but many of the results are irrelevant. If they ask for a very specific pattern then they get fewer results but they lose privacy.
Bloom filters serve this function by allowing an SPV node to specify a search pattern for transactions that can be tuned towards precision or privacy. A more specific bloom filter will produce accurate results, but at the expense of revealing what addresses are used in the user's wallet. A less specific bloom filter will produce more data about more transactions, many irrelevant to the node, but will allow the node to maintain better privacy.

Loading…
Cancel
Save