The names of the two methods (which are taken from BIP152) can be a bit
confusing. Low-bandwidth mode saves bandwidth by not sending blocks in
most cases. High-bandwidth mode uses more bandwidth than low-bandwidth
mode but, in most cases, much less bandwidth than was used for block
relay before compact blocks were implemented.
=== Private Block Relay Networks
Although compact blocks go a long way towards minimizing the time it
takes for blocks to propagate across the network,
it's possible to minimize latency further. Unlike
compact blocks, though, the other solutions involve tradeoffs that
make them unavailable or unsuitable for the public P2P relay network.
For that reason, there has been experimentation with private relay
networks for blocks.
One simple technique is to pre-select a route between endpoints. For
example, a relay network with servers running in datacenters near major
trans-oceanic fiber optic lines might be able to forward new blocks
faster than waiting for the block to arrive at the node run by some home
user many kilometers away from the fiber optic line.
Another, more complex technique is Forward Error Correction (FEC).
This allows a compact block message to be split into several parts, with
each part having extra data appended. If any of the parts isn't
received, that part can be reconstructed from the parts that are
received. Depending on the settings, up to several parts may be
reconstructed if they are lost.
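The idea can be illustrated with the simplest possible erasure code: a single XOR parity part that can rebuild any one lost part. This is only a sketch under that assumption. The function names are hypothetical, and production relay software uses stronger codes that can repair several lost parts.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size parts and append one XOR parity part."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    parts = [padded[i * size:(i + 1) * size] for i in range(k)]
    return parts + [reduce(xor_bytes, parts)]


def recover(parts: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing part (marked None); return the data parts."""
    missing = [i for i, part in enumerate(parts) if part is None]
    if len(missing) > 1:
        raise ValueError("a single parity part can repair only one loss")
    repaired = list(parts)
    if missing:
        present = [part for part in parts if part is not None]
        # XOR of all surviving parts equals the missing part
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity part


message = b"compact block payload"
sent = encode(message, k=4)
received = list(sent)
received[2] = None  # simulate one part lost in transit
# Stripping zero padding is a shortcut that only works for this demo message
restored = b"".join(recover(received)).rstrip(b"\x00")
```

The receiver rebuilds the lost part locally, so no re-request is needed.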
FEC avoids the problem of a compact block (or some parts of it) not
arriving due to problems with the underlying network connection.
Those problems frequently occur but we don't often notice them
because we mostly use protocols that automatically re-request the
missing data. However, requesting missing data triples the time to
receive it. For example:
1. Alice sends some data to Bob
2. Bob doesn't receive the data (or it is damaged). Bob re-requests
the data from Alice
3. Alice sends the data again
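The arithmetic behind "triples" can be written out as a tiny sketch. It assumes a fixed one-way latency between Alice and Bob (the 100 ms figure is made up for illustration) and that Bob notices the loss at the moment the data should have arrived.

```python
def delivery_time(one_way_ms: float, lost_first_try: bool) -> float:
    """Time until Bob holds the data, counted in one-way trips."""
    # Lost case: original send + Bob's re-request + Alice's resend
    trips = 3 if lost_first_try else 1
    return trips * one_way_ms


normal = delivery_time(100, lost_first_try=False)
with_loss = delivery_time(100, lost_first_try=True)
```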
A third technique is to assume all nodes receiving the data have
almost all of the same transactions in their mempool, so they can all
accept the same compact block. That not only saves us time computing
a compact block at each hop but it means that each hop can simply
relay the FEC packets to the next hop even before validating them.
The tradeoff for each of the above methods is that they work well with
centralization but not in a decentralized network where individual nodes
can't trust other nodes. Servers in datacenters cost money and can
often be accessed by operators of the datacenter, making them less
trustworthy than a secure home computer. Relaying data before
validating makes it easy to waste bandwidth, so it can only reasonably
be used on a private network where there's some level of trust and
accountability between parties.
The original
https://www.bitcoinrelaynetwork.org[Bitcoin Relay Network] was created by
developer Matt Corallo in 2015 to enable fast synchronization of
blocks between miners with very low latency. The network consisted of
several Virtual Private Servers (VPSes) hosted on
infrastructure around the world and served to connect the majority of
miners and mining pools.
The original Bitcoin Relay Network was replaced in 2016
with the introduction of the _Fast Internet Bitcoin Relay Engine_ or
https://bitcoinfibre.org[_FIBRE_], also created by developer Matt
Corallo. FIBRE is software that allows operating a UDP-based relay network that relays blocks within a
network of nodes. FIBRE implements FEC and the _compact block_ optimization to
further reduce the amount of data transmitted and the network latency.
=== Network Discovery
When a new node boots up, it must discover other
Bitcoin nodes on the network in order to participate. To start this
process, a new node must discover at least one existing node on the
network and connect to it. The geographic location of other nodes is
irrelevant; the Bitcoin network topology is not geographically defined.
Therefore, any existing Bitcoin nodes can be selected at random.
To connect to a known peer, nodes establish a TCP connection, usually to
port 8333 (the port generally known as the one used by Bitcoin), or an
alternative port if one is provided. Upon establishing a connection, the
node will start a "handshake" (see <<network_handshake>>) by
transmitting a +version+ message, which contains basic identifying
information, including:
+Version+:: The Bitcoin P2P protocol version the client "speaks" (e.g., 70002)
+nLocalServices+:: A list of local services supported by the node
+nTime+:: The current time
+addrYou+:: The IP address of the remote node as seen from this node
+addrMe+:: The IP address of the local node, as discovered by the local node
+subver+:: A sub-version showing the type of software running on this node (e.g., pass:[<span class="keep-together"><code>/Satoshi:0.9.2.1/</code></span>])
+BestHeight+:: The block height of this node's blockchain
+fRelay+:: A field added by BIP37 for requesting not to receive unconfirmed transactions
The +version+ message is always the first message sent by any peer to
another peer. The local peer receiving a +version+ message will examine
the remote peer's reported +Version+ and decide if the remote peer is
compatible. If the remote peer is compatible, the local peer will
acknowledge the +version+ message and establish a connection by sending
a +verack+.
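As a rough sketch, the payload of a +version+ message could be serialized as follows. The function and parameter names are hypothetical. The wire format also carries a random nonce (used to detect self-connections) that is not in the list above, and the sketch leaves out the message header (magic bytes, command name, length, checksum) that wraps every P2P message.

```python
import struct
import time


def net_addr(ip: str, port: int, services: int = 0) -> bytes:
    """Encode services + IPv4-mapped IPv6 address + big-endian port (26 bytes)."""
    ipv4 = bytes(int(octet) for octet in ip.split("."))
    return (struct.pack("<Q", services) + b"\x00" * 10 + b"\xff\xff"
            + ipv4 + struct.pack(">H", port))


def version_payload(version, services, addr_you, addr_me, subver,
                    best_height, relay, timestamp=None, nonce=0):
    """Serialize the payload of a version message (no message header)."""
    if timestamp is None:
        timestamp = int(time.time())
    agent = subver.encode("ascii")
    if len(agent) >= 0xfd:
        raise ValueError("sketch only handles short sub-version strings")
    return (struct.pack("<iQq", version, services, timestamp)
            + net_addr(*addr_you)                    # remote node as we see it
            + net_addr(*addr_me, services=services)  # our own address
            + struct.pack("<Q", nonce)
            + bytes([len(agent)]) + agent            # length-prefixed string
            + struct.pack("<i", best_height)
            + struct.pack("<?", relay))


payload = version_payload(
    version=70016, services=0x409,
    addr_you=("82.64.116.5", 8333), addr_me=("72.253.6.11", 50564),
    subver="/Satoshi:24.0.1/", best_height=788954, relay=False,
    timestamp=1683647841)
```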
How does a new node find peers? The first method is to query DNS using a
number of _DNS seeds_, which are DNS servers that provide a list of IP
addresses of Bitcoin nodes. Some of those DNS seeds provide a static
list of IP addresses of stable Bitcoin listening nodes. Some of the DNS
seeds are custom implementations of BIND (Berkeley Internet Name Daemon)
that return a random subset from a list of Bitcoin node addresses
collected by a crawler or a long-running Bitcoin node. The Bitcoin Core
client contains the names of several different DNS seeds. The diversity of
ownership and diversity of implementation of the different DNS seeds
offers a high level of reliability for the initial bootstrapping
process. In the Bitcoin Core client, the option to use the DNS seeds is
controlled by the option switch +-dnsseed+ (set to 1 by default, to use
the DNS seed).
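Querying a DNS seed is an ordinary hostname lookup. The sketch below shows the idea with Python's standard resolver. The helper name and the `resolver` parameter are hypothetical, added so the lookup can be replaced with a stand-in for offline testing.

```python
import socket


def peers_from_seed(seed_host, port=8333, resolver=socket.getaddrinfo):
    """Return the unique IP addresses a DNS seed reports for seed_host."""
    results = resolver(seed_host, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({result[4][0] for result in results})


def fake_resolver(host, port, proto=0):
    """Stand-in for a real lookup, using documentation-only addresses."""
    return [
        (socket.AF_INET, socket.SOCK_STREAM, proto, "", ("192.0.2.10", port)),
        (socket.AF_INET, socket.SOCK_STREAM, proto, "", ("192.0.2.11", port)),
        (socket.AF_INET, socket.SOCK_STREAM, proto, "", ("192.0.2.10", port)),
    ]


candidates = peers_from_seed("seed.example.invalid", resolver=fake_resolver)
```

With the default resolver, passing the hostname of a real DNS seed (for example, `seed.bitcoin.sipa.be`, one of the names that has shipped with Bitcoin Core) would perform a live lookup and return a different set of addresses on each run.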
Alternatively, a bootstrapping node that knows nothing of the network
must be given the IP address of at least one Bitcoin node, after which
it can establish connections through further introductions. The
command-line argument +-seednode+ can be used to connect to one node
just for introductions using it as a seed. After the initial seed node
is used to form introductions, the client will disconnect from it and
use the newly discovered peers.
A node must connect to a few different peers in order to establish
diverse paths into the Bitcoin network. Paths are not reliable—nodes
come and go—and so the node must continue to discover new nodes as it
loses old connections as well as assist other nodes when they bootstrap.
Only one connection is needed to bootstrap, because the first node can
offer introductions to its peer nodes and those peers can offer further
introductions. It's also unnecessary and wasteful of network resources
to connect to more than a handful of nodes. After bootstrapping, a node
will remember its most recent successful peer connections, so that if it
is rebooted it can quickly reestablish connections with its former peer
network. If none of the former peers respond to its connection request,
the node can use the seed nodes to bootstrap again.
On a node running the Bitcoin Core client, you can list the peer
connections with the command +getpeerinfo+:
[source,bash]
----
$ bitcoin-cli getpeerinfo
----
[source,json]
----
[
  {
    "id": 0,
    "addr": "82.64.116.5:8333",
    "addrbind": "192.168.0.133:50564",
    "addrlocal": "72.253.6.11:50564",
    "network": "ipv4",
    "services": "0000000000000409",
    "servicesnames": [
      "NETWORK",
      "WITNESS",
      "NETWORK_LIMITED"
    ],
    "lastsend": 1683829947,
    "lastrecv": 1683829989,
    "last_transaction": 0,
    "last_block": 1683829989,
    "bytessent": 3558504,
    "bytesrecv": 6016081,
    "conntime": 1683647841,
    "timeoffset": 0,
    "pingtime": 0.204744,
    "minping": 0.20337,
    "version": 70016,
    "subver": "/Satoshi:24.0.1/",
    "inbound": false,
    "bip152_hb_to": true,
    "bip152_hb_from": false,
    "startingheight": 788954,
    "presynced_headers": -1,
    "synced_headers": 789281,
    "synced_blocks": 789281,
    "inflight": [
    ],
    "relaytxes": false,
    "minfeefilter": 0.00000000,
    "addr_relay_enabled": false,
    "addr_processed": 0,
    "addr_rate_limited": 0,
    "permissions": [
    ],
    "bytessent_per_msg": {
      ...
    },
    "bytesrecv_per_msg": {
      ...
    },
    "connection_type": "block-relay-only"
  },
]
----
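The +services+ field in this output is a hexadecimal bit field, and +servicesnames+ is its decoded form. The following sketch decodes it by hand and summarizes a peer list. The helper names are hypothetical, and the table covers only a few well-known service bits rather than every bit a node might set.

```python
import json

# Bit position -> service name (a partial table)
SERVICE_BITS = {
    0: "NETWORK",
    2: "BLOOM",
    3: "WITNESS",
    6: "COMPACT_FILTERS",
    10: "NETWORK_LIMITED",
}


def service_names(services_hex: str) -> list:
    """Decode the hex 'services' field into the names of the bits that are set."""
    flags = int(services_hex, 16)
    return [name for bit, name in SERVICE_BITS.items() if flags & (1 << bit)]


def summarize(peers_json: str) -> list:
    """Return (address, sub-version, connection type, services) per peer."""
    return [
        (peer["addr"], peer["subver"], peer["connection_type"],
         service_names(peer["services"]))
        for peer in json.loads(peers_json)
    ]


# A trimmed-down version of the output shown above
sample = """[{"addr": "82.64.116.5:8333",
              "services": "0000000000000409",
              "subver": "/Satoshi:24.0.1/",
              "connection_type": "block-relay-only"}]"""
summary = summarize(sample)
```

The same function could be given the complete output of `bitcoin-cli getpeerinfo` from a running node.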
To override the automatic management of peers and to specify a list of
IP addresses, users can provide the option +-connect=<IPAddress>+ and
specify one or more IP addresses. If this option is used, the node will
only connect to the selected IP addresses, instead of discovering and
maintaining the peer connections automatically.
If there is no traffic on a connection, nodes will periodically send a
message to maintain the connection. If a node has not communicated on a
connection for too long, it is assumed to be disconnected
and a new peer will be sought. Thus, the network dynamically adjusts to
transient nodes and network problems, and can organically grow and
shrink as needed without any central control.
=== Full Nodes
Full nodes are nodes that verify every transaction in every block on the
valid blockchain with the most proof of work.
Full nodes
independently process every block, starting after the very first
block (genesis block) and building up to the latest known block in the
network. A full node can independently and authoritatively
verify any transaction.
The full node relies on the network to
receive updates about new blocks of transactions, which it then verifies
and incorporates into its local view of which scripts control which
bitcoins, called the set of _unspent transaction outputs_ (UTXOs).
Running a full node gives
you the pure Bitcoin experience: independent verification of all
transactions without the need to rely on, or trust, any other systems.
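Conceptually, the UTXO set is a lookup table from an output's location to the script and amount it holds. The sketch below shows how one transaction would update such a table. It is a deliberate simplification with hypothetical names: a real node also checks scripts, signatures, amounts, and many other consensus rules before accepting a transaction.

```python
def apply_transaction(utxos: dict, txid: str, inputs: list, outputs: list) -> None:
    """Spend the referenced outputs and add the new ones to the UTXO set.

    utxos maps (txid, output_index) -> (script, amount_in_satoshis).
    """
    if len(set(inputs)) != len(inputs):
        raise ValueError("transaction spends the same output twice")
    for outpoint in inputs:
        if outpoint not in utxos:
            raise ValueError(f"output missing or already spent: {outpoint}")
    for outpoint in inputs:
        del utxos[outpoint]  # spent outputs leave the set
    for index, (script, amount) in enumerate(outputs):
        utxos[(txid, index)] = (script, amount)  # new outputs enter it


utxos = {("aa", 0): ("script-alice", 50_000)}
apply_transaction(utxos, "bb", inputs=[("aa", 0)],
                  outputs=[("script-bob", 30_000), ("script-alice", 19_000)])

try:
    # A second spend of the same output is rejected
    apply_transaction(utxos, "cc", inputs=[("aa", 0)], outputs=[])
    double_spend_rejected = False
except ValueError:
    double_spend_rejected = True
```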
There are a few alternative implementations of
full nodes, built using different programming
languages and software architectures, or which made different design
decisions. However, the most common
implementation is Bitcoin Core.
More than 95% of full nodes on the Bitcoin network run
various versions of Bitcoin Core. It is identified as "Satoshi" in the
sub-version string sent in the +version+ message and shown by the
command +getpeerinfo+ as we saw earlier; for example, +/Satoshi:24.0.1/+.
=== Exchanging "Inventory"
The first thing a full
node will do once it connects to peers is try to construct a complete
chain of block headers. If it is a brand-new node and has no blockchain at all, it
only knows one block, the genesis block, which is statically embedded in
the client software. Starting after block #0 (the genesis block), the new
node will have to download hundreds of thousands of blocks to
synchronize with the network and reestablish the full blockchain.
The
process of syncing the blockchain starts with the +version+ message,
because that contains +BestHeight+, a node's current blockchain height
(number of blocks). A node will see the +version+ messages from its
peers, know how many blocks they each have, and be able to compare to
how many blocks it has in its own blockchain. Peered nodes will exchange
a +getheaders+ message that contains the hash of the top
block on their local blockchain. One of the peers will be able to
identify the received hash as belonging to a block that is not at the
top, but rather belongs to an older block, thus deducing that its own
local blockchain is longer than its peer's.
The peer that has the longer blockchain has more blocks than the other
node and can identify which headers the other node needs in order to
"catch up." It will identify the first 2,000 headers to share using a
+headers+ message. The node will keep requesting additional headers
until it has received one for every block the remote peer claims to
have.
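The request loop can be sketched as follows. For simplicity, this model identifies headers by block height and uses a hypothetical `serve_headers` function to stand in for the remote peer. The real +getheaders+ message identifies the requester's position using block hashes instead.

```python
MAX_HEADERS = 2000  # most headers a single headers message can carry


def serve_headers(peer_chain: list, after_height: int) -> list:
    """Hypothetical peer: return up to 2,000 headers after after_height."""
    start = after_height + 1
    return peer_chain[start:start + MAX_HEADERS]


def sync_headers(peer_chain: list, local_height: int = 0):
    """Keep requesting until the peer returns a batch that isn't full."""
    requests = 0
    while True:
        batch = serve_headers(peer_chain, local_height)
        requests += 1
        local_height += len(batch)
        if len(batch) < MAX_HEADERS:
            return local_height, requests


# The peer has blocks 0 through 789,281; we start with only the genesis block
peer_chain = list(range(789_282))
height, requests = sync_headers(peer_chain)
```

In this run, the node needs 395 requests to learn about all 789,281 blocks after the genesis block.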
In parallel, the node will begin requesting the blocks for each header
it previously received using a +getdata+ message. The node will request
different blocks from each of its selected peers, which allows it to drop
connections to peers that are significantly slower than the average in
order to find newer (and possibly faster) peers.
Let's assume, for example, that a node only has the genesis block. It
will then receive a +headers+ message from its peers containing the headers
of the next 2,000 blocks in the chain. It will start requesting blocks
from all of its connected peers, keeping a queue of up to 1,024 blocks.
Blocks need to be validated in order, so if the oldest block in the
queue--the block the node next needs to validate--hasn't been received
yet, the node drops the connection to the peer that was supposed to
provide that block. It then finds a new peer that may be able to
provide one block before all of the node's other peers are able to
provide 1,023 blocks.
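The stalling rule can be sketched as a small check. This is a loose model of the behavior described above with hypothetical names, not a transcription of Bitcoin Core's logic, which also involves timeouts.

```python
WINDOW = 1024  # maximum number of blocks in the download queue


def stalling_peer(next_height: int, in_flight: dict, received: set):
    """Return the peer to disconnect, or None if nobody is stalling.

    next_height: the oldest block in the queue (next one to validate)
    in_flight:   height -> peer asked to provide that block
    received:    heights already downloaded but not yet validated
    """
    if next_height in received:
        return None  # validation can proceed
    rest_of_window = range(next_height + 1, next_height + WINDOW)
    if all(height in received for height in rest_of_window):
        # The other 1,023 blocks arrived first: one peer is holding us up
        return in_flight.get(next_height)
    return None


in_flight = {1: "peer-a"}
blocked = stalling_peer(1, in_flight, received=set(range(2, 1025)))
still_filling = stalling_peer(1, in_flight, received=set(range(2, 500)))
```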
As each block is received, it is added to the
blockchain, as we will see in <<blockchain>>. As the local blockchain is
gradually built up, more blocks are requested and received, and the
process continues until the node catches up to the rest of the network.
This process of comparing the local blockchain with the peers and
retrieving any missing blocks happens any time a node has been offline for
an extended period of time.
[[spv_nodes]]
=== Lightweight Clients
Many Bitcoin clients are designed to run on space- and
power-constrained devices, such as smartphones, tablets, or embedded
systems. For such devices, a _simplified payment verification_ (SPV)
method is used to allow them to operate without validating the full
blockchain. These types of clients are called lightweight
clients.
Lightweight clients download only the block headers and do not download the
transactions included in each block. The resulting chain of headers,
without transactions, is about 10,000 times smaller than the full blockchain.
Lightweight clients cannot construct a full picture of all the UTXOs that are
available for spending because they do not know about all the
transactions on the network. Instead, they verify transactions using a
slightly different method that relies on peers to provide partial views
of relevant parts of the blockchain on demand.
As an analogy, a full node is like a tourist in a strange city, equipped
with a detailed map of every street and every address. By comparison, a
lightweight client is like a tourist in a strange city asking random strangers for
turn-by-turn directions while knowing only one main avenue. Although
both tourists can verify the existence of a street by visiting it, the
tourist without a map doesn't know what lies down any of the side
streets and doesn't know what other streets exist. Positioned in front
of 23 Church Street, the tourist without a map cannot know if there are
a dozen other "23 Church Street" addresses in the city and whether this
is the right one. The mapless tourist's best chance is to ask enough
people and hope some of them are not trying to mug him.
Lightweight clients verify transactions by reference to their _depth_ in the blockchain. Whereas a full node will construct a fully verified chain of thousands of blocks and millions of transactions reaching down the blockchain (back in time) all the way to the genesis block, a lightweight client will verify the proof of work of all blocks (but not whether the blocks and all of their transactions are valid) and link that chain to the transaction of interest.
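Checking a header's proof of work requires only the 80-byte header itself. The following sketch hashes a header twice with SHA256 and compares the result to the target encoded in the header's compact "bits" field, using the genesis block header as test data. The function names are hypothetical, and the sketch ignores edge cases of the compact encoding (very small exponents and the sign bit).

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field into the full target (exponent >= 3)."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))


def header_has_valid_pow(header: bytes) -> bool:
    """Check that an 80-byte header's hash is at or below its own target."""
    if len(header) != 80:
        raise ValueError("block headers are exactly 80 bytes")
    bits = struct.unpack_from("<I", header, 72)[0]
    block_hash = int.from_bytes(double_sha256(header), "little")
    return block_hash <= bits_to_target(bits)


# Genesis block header: version, previous block hash, merkle root,
# time, bits, nonce (hashes are stored in reversed byte order)
merkle_root = bytes.fromhex(
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")[::-1]
genesis_header = (struct.pack("<I", 1) + bytes(32) + merkle_root
                  + struct.pack("<III", 1231006505, 0x1D00FFFF, 2083236893))

valid = header_has_valid_pow(genesis_header)
tampered = header_has_valid_pow(genesis_header[:-4] + bytes(4))  # nonce zeroed
```

A lightweight client would additionally check that each header commits to the hash of the previous header and that the target itself follows the difficulty adjustment rules.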
For example, when examining a transaction in block 800,000, a full node
verifies all 800,000 blocks down to the genesis block and builds a full
database of UTXOs, establishing the validity of the transaction by
confirming that the transaction exists and its output remains unspent. A lightweight client can
only verify that the transaction exists. The client establishes a link
between the transaction and the block that contains it, using a _merkle
path_ (see <<merkle_trees>>). Then, the lightweight client waits until it sees the
six blocks 800,001 through 800,006 piled on top of the block containing
the transaction and verifies it by establishing its depth under blocks
800,006 to 800,001. The fact that other nodes on the network accepted
block 800,000 and that miners did the necessary work to produce six more blocks
on top of it is proof, by proxy, that the transaction actually exists.
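The two checks in this example, the merkle path and the depth, can be sketched as follows. The function names are hypothetical and the four "transactions" are placeholder byte strings. The path lists the sibling hash at each level of the tree, from the bottom up.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_merkle_path(txid: bytes, index: int, path: list, root: bytes) -> bool:
    """Hash upward from a transaction to the merkle root in a block header."""
    current = txid
    for sibling in path:
        if index & 1:  # we are the right child at this level
            current = double_sha256(sibling + current)
        else:          # we are the left child
            current = double_sha256(current + sibling)
        index >>= 1
    return current == root


def confirmations(tx_block_height: int, tip_height: int) -> int:
    """Number of blocks piled on top of the block containing the transaction."""
    return tip_height - tx_block_height


# Build a four-transaction tree by hand to obtain a root and a path
txids = [double_sha256(bytes([i])) for i in range(4)]
left = double_sha256(txids[0] + txids[1])
right = double_sha256(txids[2] + txids[3])
root = double_sha256(left + right)

proof_ok = verify_merkle_path(txids[2], 2, [txids[3], left], root)
proof_bad = verify_merkle_path(txids[1], 2, [txids[3], left], root)
depth = confirmations(800_000, 800_006)
```

Here the client needs only two sibling hashes and the root from the block header to confirm that the third transaction is in the block.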
A lightweight client cannot normally be persuaded that a transaction exists in a block
when the transaction does not in fact exist. The lightweight client establishes
the existence of a transaction in a block by requesting a merkle path
proof and by validating the proof of work in the chain of blocks.
However, a transaction's existence can be "hidden" from a lightweight client. A
lightweight client can definitely verify that a transaction exists but cannot
verify that a transaction, such as a double-spend of the same UTXO,
doesn't exist because it doesn't have a record of all transactions. This
vulnerability can be used in a denial-of-service attack or for a
double-spending attack against lightweight clients. To defend against this, a lightweight
client needs to connect randomly to several nodes, to increase the
probability that it is in contact with at least one honest node. This
need to randomly connect means that lightweight clients also are vulnerable to
network partitioning attacks or Sybil attacks, where they are connected
to fake nodes or fake networks and do not have access to honest nodes or
the real Bitcoin network.
For many practical purposes, well-connected lightweight clients are secure enough,
striking a balance between resource needs, practicality, and security.
For infallible security, however, nothing beats running a full
node.
[TIP]
====
A full node verifies a transaction by checking the entire chain of
thousands of blocks below it in order to guarantee that the UTXO exists
and is not spent, whereas a lightweight client only proves that a transaction
exists and checks that the block containing that transaction is
buried by a handful of blocks above it.
====
To get the block headers they need to verify that a transaction is part of the
chain, lightweight clients use a +getheaders+ message.
The responding peer will send up to 2,000 block headers
using a single +headers+ message. See the illustration in
<<spv_synchronization>>.
[[spv_synchronization]]
.Lightweight client synchronizing the block headers