Resolve merge conflicts in chapter 5

pull/83/head
Minh T. Nguyen 10 years ago
commit 83f0f8aa5c

@ -6,7 +6,7 @@
Transactions are the most important part of the bitcoin system. Everything else in bitcoin is designed to ensure that transactions can be created, propagated on the network, validated, and finally added to the global ledger of transactions, the blockchain. Transactions are data structures that encode the transfer of value between participants in the bitcoin system. Each transaction is an public entry in bitcoin's global double-entry bookkeeping ledger, the blockchain.
In this chapter we will examine all the various forms of transactions, what they contain, how to create them, how they are verified, and become part of the permanent record of all transactions.
In this chapter we will examine all the various forms of transactions, what do they contain, how to create them, how they are verified, and how they become part of the permanent record of all transactions.
[[tx_lifecycle]]
=== Transaction Lifecycle
@ -23,12 +23,12 @@ Once a transaction has been created, it is signed by the owner (or owners) of th
[[tx_bcast]]
==== Broadcasting Transactions to the Bitcoin Network
Next, a transaction needs to be delivered to the bitcoin network so that it can be propagated and be included in the blockchain. In essence, a bitcoin transaction is just 300-400 bytes of data and has to reach any one of tens of thousands of bitcoin nodes. The sender does not need to trust the nodes they use to broadcast the transaction, as long as they use more than one to ensure that it propagates. The nodes don't need to trust the sender or establish the sender's "identity". Since the transaction is signed and contains no confidential information, private keys or credentials, it can be publicly broadcast using any underlying network transport that is convenient. Unlike credit card transactions, for example, which contain sensitive information and can only be transmitted on encrypted networks, a bitcoin transaction can be sent over any network. As long as the transaction can reach a bitcoin node that will propagate it into the bitcoin network, it doesn't matter how it is transported to the first node. Bitcoin transactions can therefore be transmitted to the bitcoin network over insecure networks such as Wifi, Bluetooth, NFC, Chirp, barcodes or by copy and paste in a web form. In extreme cases, a bitcoin transaction could be transmitted over packet radio, satellite relay or shortwave using burst transmission, spread spectrum or frequency hoping to evade detection and jamming. A bitcoin transaction could even be encoded as smileys (emoticons) and posted in a public forum or sent as a text message or Skype chat message. Bitcoin has turned money into a data structure making it virtually impossible to stop anyone from creating and executing a bitcoin transaction.
First, a transaction needs to be delivered to the bitcoin network so that it can be propagated and be included in the blockchain. In essence, a bitcoin transaction is just 300-400 bytes of data and has to reach any one of tens of thousands of bitcoin nodes. The sender does not need to trust the nodes they use to broadcast the transaction, as long as they use more than one to ensure that it propagates. The nodes don't need to trust the sender or establish the sender's "identity". Since the transaction is signed and contains no confidential information, private keys or credentials, it can be publicly broadcast using any underlying network transport that is convenient. Unlike credit card transactions, for example, which contain sensitive information and can only be transmitted on encrypted networks, a bitcoin transaction can be sent over any network. As long as the transaction can reach a bitcoin node that will propagate it into the bitcoin network, it doesn't matter how it is transported to the first node. Bitcoin transactions can therefore be transmitted to the bitcoin network over insecure networks such as Wifi, Bluetooth, NFC, Chirp, barcodes or by copying and pasting into a web form. In extreme cases, a bitcoin transaction could be transmitted over packet radio, satellite relay or shortwave using burst transmission, spread spectrum or frequency hoping to evade detection and jamming. A bitcoin transaction could even be encoded as smileys (emoticons) and posted in a public forum or sent as a text message or Skype chat message. Bitcoin has turned money into a data structure, making it virtually impossible to stop anyone from creating and executing a bitcoin transaction.
[[tx_propagation]]
==== Propagating Transactions on the Bitcoin Network
Once a bitcoin transaction is sent to any node connected to the bitcoin network, the transaction will be validated by that node. If valid, that node will propagate it to the other nodes it is connected to and a success message will be returned synchronously to the originator. If the transaction is invalid, the node will reject it and synchronously return a rejection message to the originator. The bitcoin network is a peer-to-peer network, meaning that each bitcoin node is connected to a few other bitcoin nodes which it discovers during startup through the peer-to-peer protocol. The entire network forms a loosely connected mesh without a fixed topology or any structure, making all nodes equal peers. Messages, including transactions and blocks are propagated from each node to the peers it is connected to. A new validated transaction injected into any node on the network will be sent to 3-4 of the neighboring nodes, each of which will send it to 3-4 more nodes and so on. In this way, within a few seconds a valid transaction will propagate in an exponentially expanding ripple across the network until all connected nodes have received it. The bitcoin network is designed to propagate transactions and blocks to all nodes in an efficient and resilient manner that is resistant to attacks. To prevent spamming, denial of service attacks or other nuisance attacks agains the bitcoin system, every node will independently validate every transaction before propagating it further. A malformed transaction will not get beyond one node. The rules by which transactions are validated are explained in more detail in <<tx_validation>>
Once a bitcoin transaction is sent to any node connected to the bitcoin network, the transaction will be validated by that node. If valid, that node will propagate it to the other nodes to which it is connected and a success message will be returned synchronously to the originator. If the transaction is invalid, the node will reject it and synchronously return a rejection message to the originator. The bitcoin network is a peer-to-peer network meaning that each bitcoin node is connected to a few other bitcoin nodes that it discovers during startup through the peer-to-peer protocol. The entire network forms a loosely connected mesh without a fixed topology or any structure making all nodes equal peers. Messages, including transactions and blocks, are propagated from each node to the peers to which it is connected. A new validated transaction injected into any node on the network will be sent to 3 to 4 of the neighboring nodes, each of which will send it to 3 to 4 more nodes and so on. In this way, within a few seconds a valid transaction will propagate in an exponentially expanding ripple across the network until all connected nodes have received it. The bitcoin network is designed to propagate transactions and blocks to all nodes in an efficient and resilient manner that is resistant to attacks. To prevent spamming, denial of service attacks, or other nuisance attacks against the bitcoin system, every node will independently validate every transaction before propagating it further. A malformed transaction will not get beyond one node. The rules by which transactions are validated are explained in more detail in <<tx_validation>>.
[[tx_mining]]
==== Mining Transactions into Blocks
@ -43,7 +43,7 @@ The blockchain forms the authoritative ledger of all transactions since bitcoin'
A transaction is a data structure that encodes a transfer of value from a source of funds, called an "input", to a destination, called an "output". Transaction inputs and outputs are not related to accounts or identities. Instead you should think of them as bitcoin amounts, chunks of bitcoin, being locked with a specific secret which only the owner, or person know knows the secret, can unlock.
A transaction contains a number of fields, in addition to the inputs and outputs, as follows:
A transaction contains a number of fields, as follows:
[[tx_data_structure]]
.The structure of a transaction
@ -63,20 +63,20 @@ Note: Locktime defines the earliest time that a transaction can be added to the
[[tx_inputs_outputs]]
=== Transaction Outputs and Inputs
The fundamental building block of a bitcoin transaction is an _unspent transaction output_ or UTXO. UTXO are indivisible chunks of bitcoin currency locked to a specific owner, recorded on the blockchain, and recognized as currency units by the entire network. The bitcoin network tracks all available (unspent) UTXO currently numbering in the millions. Whenever a user receives bitcoin, that amount is recorded within the blockchain as a UTXO. Thus, a user's bitcoin may be scattered as UTXO amongst hundreds of transactions and hundreds of blocks. In effect, there is no such thing as a balance of a bitcoin address or account; there are only scattered UTXO, locked to specific owners. The concept of a user's bitcoin balance is a construct created by the wallet application. The wallet calculates the user's balance by scanning the blockchain and aggregating all UTXO belonging to that user.
The fundamental building block of a bitcoin transaction is an _unspent transaction output_ or UTXO. UTXO are indivisible chunks of bitcoin currency locked to a specific owner, recorded on the blockchain, and recognized as currency units by the entire network. The bitcoin network tracks all available (unspent) UTXO currently numbering in the millions. Whenever a user receives bitcoin, that amount is recorded within the blockchain as a UTXO. Thus, a user's bitcoin may be scattered as UTXO amongst hundreds of transactions and hundreds of blocks. In effect, there is no such thing as a stored balance of a bitcoin address or account; there are only scattered UTXO, locked to specific owners. The concept of a user's bitcoin balance is a derived construct created by the wallet application. The wallet calculates the user's balance by scanning the blockchain and aggregating all UTXO belonging to that user.
[TIP]
====
There are no accounts or balances in bitcoin, there are only _unspent transaction outputs_ (UTXO) scattered in the blockchain.
====
Unlike cash which exists in specific denominations, one dollar, five dollars, ten dollars, etc., a UTXO can have any arbitrary value denominated as a multiple of satoshis (the smallest bitcoin unit equal to 100 millionth of a bitcoin). While UTXO can be any arbitrary value, once created it is indivisible just like a coin that cannot be cut in half. If a UTXO is larger than the desired value of a transaction, it must still be consumed in its entirety and change must be generated in the transaction. In other words, if you have a 20 bitcoin UTXO and want to pay 1 bitcoin, your transaction must consume the entire 20 bitcoin UTXO and produce two outputs: one paying 1 bitcoin to your desired recipient and another paying 19 bitcoin in change back to your wallet. As a result, bitcoin transactions must occasionally generate change.
Unlike cash, which exists in specific denominations (one dollar, five dollars, ten dollars), a UTXO can have any arbitrary value denominated as a multiple of satoshis (the smallest bitcoin unit equal to 100 millionth of a bitcoin). While UTXO can be any arbitrary value, once created it is indivisible just like a coin that cannot be cut in half. If a UTXO is larger than the desired value of a transaction, it must still be consumed in its entirety and change must be generated in the transaction. In other words, if you have a 20 bitcoin UTXO and want to pay 1 bitcoin, your transaction must consume the entire 20 bitcoin UTXO and produce two outputs: one paying 1 bitcoin to your desired recipient and another paying 19 bitcoin in change back to your wallet. As a result, most bitcoin transactions will generate change.
In simple terms, transactions consume the sender's available UTXO and create new UTXO locked to the recipient's bitcoin address. Imagine a shopper buying a $1.50 beverage, reaching into their wallet and trying to find a combination of coins and bank notes to cover the $1.50 cost. The shopper will choose exact change if available (a dollar bill and two quarters), or a combination of smaller denominations (six quarters), or if necessary, a larger unit such as a bank note (five dollar note). If they hand too much money, say $5, to the shop owner they will expect $3.50 change, which they will return to their wallet and have available for future transactions. Similarly, a bitcoin transaction must be created from a users UTXO in whatever denominations that user has available. They cannot cut a UTXO in half anymore than they can cut a dollar bill in half and use it as currency. The user's wallet application will typically select from the users available UTXO various units to compose an amount greater than or equal to the desired transaction amount. As with real life, the bitcoin application can use several strategies to satisfy the purchase amount: combining several smaller units, finding exact change, or using a single unit larger than the transaction value and making change.
In simple terms, transactions consume the sender's available UTXO and create new UTXO locked to the recipient's bitcoin address. Imagine a shopper buying a $1.50 beverage, reaching into their wallet and trying to find a combination of coins and bank notes to cover the $1.50 cost. The shopper will choose exact change if available (a dollar bill and two quarters), or a combination of smaller denominations (six quarters), or if necessary, a larger unit such as a five dollar bank note. If they hand too much money, say $5, to the shop owner they will expect $3.50 change, which they will return to their wallet and have available for future transactions. Similarly, a bitcoin transaction must be created from a user's UTXO in whatever denominations that user has available. They cannot cut a UTXO in half any more than they can cut a dollar bill in half and use it as currency. The user's wallet application will typically select from the user's available UTXO various units to compose an amount greater than or equal to the desired transaction amount. As with real life, the bitcoin application can use several strategies to satisfy the purchase amount: combining several smaller units, finding exact change, or using a single unit larger than the transaction value and making change.
The UTXO consumed by a transaction are called transaction inputs, while the UTXO created by a transaction are called transaction outputs. This way, chunks of bitcoin value move forward from owner to owner in a chain of transactions consuming and creating UTXO. Transactions consume UTXO unlocking it with the signature of the current owner and create UTXO locking it to the bitcoin address of the new owner.
The exception to the output and input chain is a special type of transaction called the _coinbase_ transaction, which is the first transaction in each block. This transaction is placed there by the "winning" miner and creates brand-new bitcoin payable to that miner as a reward for mining. This is how bitcoin's money supply is created during the mining process as we will see in <<mining>>
The exception to the output and input chain is a special type of transaction called the _coinbase_ transaction, which is the first transaction in each block. This transaction is placed there by the "winning" miner and creates brand-new bitcoin payable to that miner as a reward for mining. This is how bitcoin's money supply is created during the mining process as we will see in <<mining>>.
[TIP]
@ -136,13 +136,13 @@ Note: The sequence number is used to override a transaction prior to the expirat
[[tx_fees]]
==== Transaction Fees
Most transactions include transactions fees that compensate the bitcoin miners for securing the network. Mining and the fees and rewards collected by miners are discussed in more detail in <<mining>>. This section examines how transaction fees are included in a typical transaction. Most wallets calculate and include transaction fees automatically. However, if you are constructing transactions programmatically, or using a command line interface, you must manually account for and include fees.
Most transactions include transaction fees, which compensate the bitcoin miners for securing the network. Mining and the fees and rewards collected by miners are discussed in more detail in <<mining>>. This section examines how transaction fees are included in a typical transaction. Most wallets calculate and include transaction fees automatically. However, if you are constructing transactions programmatically, or using a command line interface, you must manually account for and include these fees.
Transaction fees serve as an incentive to include (mine) a transaction into the next block and also as a disincentive against "spam" transactions or any kind of abuse of the system, by imposing a small cost on every transaction. Transaction fees are collected by the miner who mines the block that records the transaction on the blockchain.
Transaction fees are calculated based on the size of the transaction in kilobytes, not the value of the transaction in bitcoin. Overall, transaction fees are set based on market forces within the bitcoin network. Miners prioritize transactions based on many different criteria, including fees and may even process transactions for free under certain circumstances. Transaction fees affect the processing priority, meaning that a transaction with sufficient fees is likely to be included in the next-most mined block, while a transaction with insufficient or no fees may be delayed, on a best-effort basis and processed after a few blocks or not at all. Transaction fees are not mandatory and transactions without fees may be processed, eventually, but including transaction fees encourages priority processing.
Transaction fees are calculated based on the size of the transaction in kilobytes, not the value of the transaction in bitcoin. Overall, transaction fees are set based on market forces within the bitcoin network. Miners prioritize transactions based on many different criteria, including fees and may even process transactions for free under certain circumstances. Transaction fees affect the processing priority, meaning that a transaction with sufficient fees is likely to be included in the next-most mined block, while a transaction with insufficient or no fees may be delayed, on a best-effort basis and processed after a few blocks or not at all. Transaction fees are not mandatory and transactions without fees may be processed eventually; however, including transaction fees encourages priority processing.
Over time, the way transaction fees are calculated and the effect they have on transaction prioritization has been changing. At first, transaction fees were fixed and constant across the network. Gradually, the fee structure has been relaxed so that it may be influenced by market forces, based on network capacity and transaction volume. The current minimum transaction fee is fixed at 0.0001 bitcoin or a tenth of a milli-bitcoin, recently decreased from one milli-bitcoin, per kilobyte. Most transactions are less than one kilobyte, however those with multiple inputs or outputs can be larger. In future revisions of the bitcoin protocol it is expected that wallet applications will use statistical analysis to calculate the most appropriate fee to attach to a transaction based on the average fees of recent transactions.
Over time, the way transaction fees are calculated and the effect they have on transaction prioritization has been evolving. At first, transaction fees were fixed and constant across the network. Gradually, the fee structure has been relaxed so that it may be influenced by market forces, based on network capacity and transaction volume. The current minimum transaction fee is fixed at 0.0001 bitcoin or a tenth of a milli-bitcoin, recently decreased from one milli-bitcoin, per kilobyte. Most transactions are less than one kilobyte; however, those with multiple inputs or outputs can be larger. In future revisions of the bitcoin protocol it is expected that wallet applications will use statistical analysis to calculate the most appropriate fee to attach to a transaction based on the average fees of recent transactions.
The current algorithm used by miners to prioritize transactions for inclusion in a block based on their fees will be examined in detail in <<mining>>.
@ -159,13 +159,13 @@ Fees = Sum(Inputs) - Sum(Outputs)
This is a somewhat confusing element of transactions and an important point to understand, because if you are constructing your own transactions you must ensure you do not inadvertently include a very large fee by underspending the inputs. That means that you must account for all inputs, if necessary by creating change, or you will end up giving the miners a very big tip!
For example, if you consume a 20 bitcoin UTXO to make a 1 bitcoin payment, you must include a 19 bitcoin change output back to your wallet. Otherwise, the 19 bitcoin "leftover" will be counted as a transaction fee and will be collected by the miner who mines your transaction in a block. While you will receive priority processing and make a miner very happy, this is probably not what you intended.
[WARNING]
====
If you forget to add a change output in a manually constructed transaction you will be paying the change as a transaction fee. "Keep the change!" may not be what you intended.
====
For example, if you consume a 20 bitcoin UTXO to make a 1 bitcoin payment, you must include a 19 bitcoin change output back to your wallet. Otherwise, the 19 bitcoin "leftover" will be counted as a transaction fee and will be collected by the miner who mines your transaction in a block. While you will receive priority processing and make a miner very happy, this is probably not what you intended.
Let's see how this works in practice, by looking at Alice's coffee purchase again. Alice wants to spend 0.015 bitcoin to pay for coffee. To ensure this transaction is processed promptly, she will want to include a transaction fee, say 0.001. That will mean that the total cost of the transaction will be 0.016. Her wallet must therefore source a set of UTXO that adds up to 0.016 bitcoin or more and if necessary create change. Let's say her wallet has a 0.2 bitcoin UTXO available. It will therefore need to consume this UTXO, create one output to Bob's Cafe for 0.015, and a second output with 0.184 bitcoin in change back to her own wallet, leaving 0.001 bitcoin unallocated, as an implicit fee for the transaction.
Now let's look at a different scenario. Eugenia, our children's charity director in the Philippines has completed a fundraiser to purchase school books for the children. She received several thousand small donations from people all around the world, totaling 50 . Now, she wants to purchase hundreds of school books from a local publisher, paying in bitcoin. The charity received thousands of small donations from all around the world. As Eugenia's wallet application tries to construct a single larger payment transaction, it must source from the available UTXO set which is composed of many smaller amounts. That means that the resulting transaction will source from more than a hundred small-value UTXO as inputs and only one output, paying the book publisher. A transaction with that many inputs will be larger than one kilobyte, perhaps 2-3 kilobytes in size. As a result, it will require a higher fee than the minimal network fee of 0.0001 bitcoin. Eugenia's wallet application will calculate the appropriate fee by measuring the size of the transaction and multiplying that by the per-kilobyte fee. Many wallets will overpay fees for larger transactions to ensure the transaction is processed promptly. The higher fee is not because Eugenia is spending more money, but because her transaction is more complex and larger in size - the fee is independent of the transaction's bitcoin value.
@ -190,18 +190,18 @@ This is only the tip of the iceberg of possibilities that can be expressed with
[TIP]
====
Bitcoin transaction validation is not based on a static pattern, but instead is achieved through the execution of a scripting language. This language allows for a near infinite variety of conditions to be expressed. This is how bitcoin gets the power of "programmable money"
Bitcoin transaction validation is not based on a static pattern, but instead is achieved through the execution of a scripting language. This language allows for a nearly infinite variety of conditions to be expressed. This is how bitcoin gets the power of "programmable money".
====
==== Script Construction (Lock + Unlock)
Bitcoin's transaction validation engine relies on two types of scripts to validate transactions - a locking script and an unlocking script.
Bitcoin's transaction validation engine relies on two types of scripts to validate transactions -- a locking script and an unlocking script.
A locking script is an encumbrance placed on an output, that specifies the conditions that must be met to spend the output in the future. Historically, the locking script was called a _scriptPubKey_, because it usually contained a public key or bitcoin address. In this book we refer to it as a "locking script" to acknowledge the much broader range of possibilities of this scripting technology. In most bitcoin applications, what we refer to as a locking script will appear in the source code as "scriptPubKey".
A locking script is an encumbrance placed on an output, and it specifies the conditions that must be met to spend the output in the future. Historically, the locking script was called a _scriptPubKey_, because it usually contained a public key or bitcoin address. In this book we refer to it as a "locking script" to acknowledge the much broader range of possibilities of this scripting technology. In most bitcoin applications, what we refer to as a locking script will appear in the source code as "scriptPubKey".
An unlocking script is a script that "solves", or satisfies, the conditions placed on an output by a locking script and allows the output to be spent. Unlocking scripts are part of every transaction input and most of the time they contain a digital signature produced by the user's wallet from their private key. Historically, the unlocking script is called _scriptSig_, because it usually contained a digital signature. In this book we refer to it as an "unlocking script" to acknowledge the much broader range of locking script requirements, as not all unlocking scripts must contain signatures. As mentioned above, in most bitcoin applications the source code will refer to the unlocking script as "scriptSig".
Every bitcoin client will validate transaction by executing the locking and unlocking scripts together. For each input in the transaction, the validation software will first retrieve the UTXO referenced by the input. That UTXO contains a locking script defining the conditions required to spend it. The validation software will then take the unlocking script contained in the input that is attempting to spend this UTXO and concatenate them. The locking script is added to the end of the unlocking script and then the entire combined script is executed using the script execution engine. If the result of executing the combined script is "TRUE", the unlocking script has succeeded in resolving the conditions imposed by the locking script and therefore the input is a valid authorization to spend the UTXO. If any result other than "TRUE" remains after execution of the combined script, the input is invalid as it has failed to satisfy the spending conditions placed on the UTXO. Note that the UTXO is permanently recorded in the blockchain, and therefore is invariable and is unaffected by failed attempts to spend it by reference in a new transaction. Only a valid transaction that correctly satisfies the conditions of the UTXO results in the UTXO being marked as "spent" and removed from the set of available UTXO.
Every bitcoin client will validate transactions by executing the locking and unlocking scripts together. For each input in the transaction, the validation software will first retrieve the UTXO referenced by the input. That UTXO contains a locking script defining the conditions required to spend it. The validation software will then take the unlocking script contained in the input that is attempting to spend this UTXO and concatenate them. The locking script is added to the end of the unlocking script and then the entire combined script is executed using the script execution engine. If the result of executing the combined script is "TRUE", the unlocking script has succeeded in resolving the conditions imposed by the locking script and therefore the input is a valid authorization to spend the UTXO. If any result other than "TRUE" remains after execution of the combined script, the input is invalid as it has failed to satisfy the spending conditions placed on the UTXO. Note that the UTXO is permanently recorded in the blockchain, and therefore is invariable and is unaffected by failed attempts to spend it by reference in a new transaction. Only a valid transaction that correctly satisfies the conditions of the UTXO results in the UTXO being marked as "spent" and removed from the set of available (unspent) UTXO.
Below is an example of the unlocking and locking scripts for the most common type of bitcoin transaction (a payment to a public key hash), showing the combined script resulting from the concatenation of the unlocking and locking scripts prior to script validation:
@ -260,11 +260,11 @@ Transactions are valid if the top result on the stack is TRUE (1), any other non
==== Turing Incompleteness
The bitcoin transaction script language contains many operators but is deliberately limited in one important way - there are no loops or complex flow control capabilities other than conditional flow control. This ensures that the language is not Turing Complete, meaning that scripts have limited complexity and predictable execution times. These limitations ensure that the language cannot be used to create an infinite loop or other form of "logic bomb" that could be embedded in a transaction in a way that causes a Denial-of-Service attack against the bitcoin network. Remember, every transaction is validated by every full node on the bitcoin network. A limited language prevents the transaction validation mechanism from being used as a vulnerability.
The bitcoin transaction script language contains many operators but is deliberately limited in one important way -- there are no loops or complex flow control capabilities other than conditional flow control. This ensures that the language is not Turing Complete, meaning that scripts have limited complexity and predictable execution times. Script it is not a general-purpose language. These limitations ensure that the language cannot be used to create an infinite loop or other form of "logic bomb" that could be embedded in a transaction in a way that causes a Denial-of-Service attack against the bitcoin network. Remember, every transaction is validated by every full node on the bitcoin network. A limited language prevents the transaction validation mechanism from being used as a vulnerability.
==== Stateless Verification
The bitcoin transaction script language is stateless, in that there is no state prior to execution of the script, or state saved after execution of the script. Therefore, all the information needed to execute a script is contained within the script. A script will predictably execute the same way on any system. If your system verifies a script you can be sure that every other system in the bitcoin network will also verify the script, meaning that a valid transaction is valid for everyone and everyone knows this. This predictability of outcomes is a key benefit of the bitcoin system.
The bitcoin transaction script language is stateless, in that there is no state prior to execution of the script, or state saved after execution of the script. Therefore, all the information needed to execute a script is contained within the script. A script will predictably execute the same way on any system. If your system verifies a script, you can be sure that every other system in the bitcoin network will also verify the script, meaning that a valid transaction is valid for everyone and everyone knows this. This predictability of outcomes is an essential benefit of the bitcoin system.
[[std_tx]]
=== Standard Transactions
@ -364,7 +364,7 @@ The two scripts together would form the combined validation script below:
OP_0 <Signature B> <Signature C> 2 <Public Key A> <Public Key B> <Public Key C> 3 OP_CHECKMULTISIG
----
When executed, this combined script will evaluate to TRUE if, and only if, the unlocking script matches the conditions set by the locking script, that is if the unlocking script has a valid signatures from the two private keys which correspond to two of the three public keys set as an encumbrance.
When executed, this combined script will evaluate to TRUE if, and only if, the unlocking script matches the conditions set by the locking script. In this case, the condition is whether the unlocking script has a valid signature from the two private keys that correspond to two of the three public keys set as an encumbrance.
[[op_return]]
==== Data Output (OP_RETURN)
@ -383,7 +383,7 @@ OP_RETURN <data>
where the data portion is limited to 40 bytes and most often represents a hash, such as the output from the SHA256 algorithm (32 bytes). Many applications put a prefix in front of the data to help identify the application. For example, the proofofexistence.com digital notarization service uses the 8-byte prefix "DOCPROOF" which is ASCII encoded as 44f4350524f4f46 in hexadecimal.
Keep in mind that there is no "unlocking script" that corresponds to OP_RETURN, that can be used to "spend" an OP_RETURN output. The whole point of OP_RETURN is that you can't spend the money locked in that output and therefore it does not need to be held in the UTXO set as potentially spendable - OP_RETURN is _provably un-spendable_. OP_RETURN is usually an output with a zero bitcoin amount, since any bitcoin assigned to such an output is effectively lost forever. If an OP_RETURN is encountered by the script validation software it results immediately in halting the execution of the validation script and marking the transaction as invalid. Thus, if you accidentally reference an OP_RETURN output as an input in a transaction, that transaction is invalid.
Keep in mind that there is no "unlocking script" that corresponds to OP_RETURN that could possibly be used to "spend" an OP_RETURN output. The whole point of OP_RETURN is that you can't spend the money locked in that output and therefore it does not need to be held in the UTXO set as potentially spendable - OP_RETURN is _provably un-spendable_. OP_RETURN is usually an output with a zero bitcoin amount, since any bitcoin assigned to such an output is effectively lost forever. If an OP_RETURN is encountered by the script validation software, it results immediately in halting the execution of the validation script and marking the transaction as invalid. Thus, if you accidentally reference an OP_RETURN output as an input in a transaction, that transaction is invalid.
A valid transaction can have only one OP_RETURN output. However, a single OP_RETURN output can be combined in a transaction with outputs of any other type.
@ -407,7 +407,7 @@ Pay-to-Script-Hash (P2SH) was developed to resolve these practical difficulties
In P2SH transactions, the locking script that is replaced by a hash is referred to as the _redeemScript_ because it is presented to the system at redemption time rather than as a locking script.
[[without_p2sh]]
.Complex Script Without P2SH
.Complex Script without P2SH
|=======
| Locking Script | 2 PubKey1 PubKey2 PubKey3 PubKey4 PubKey5 5 OP_CHECKMULTISIG
| Unlocking Script | Sig1 Sig2
@ -451,7 +451,7 @@ A P2SH transaction locks the output to this hash instead of the longer script, u
----
OP_HASH160 54c557e07dde5bb6cb791c7a540e0a4796f5e97e OP_EQUAL
----
which, as you can see is much much shorter. Instead of "pay to this 5-key multi-signature script", the P2SH equivalent transaction is "pay to a script with this hash". A customer making a payment to Mohammed's company need only include this much shorter locking script in their payment. When Mohammed wants to spend this UTXO, they must present the original redeemScript (the one whose hash locked the UTXO) and the signatures necessary to unlock it, like this:
which, as you can see is much shorter. Instead of "pay to this 5-key multi-signature script", the P2SH equivalent transaction is "pay to a script with this hash". A customer making a payment to Mohammed's company need only include this much shorter locking script in their payment. When Mohammed wants to spend this UTXO, they must present the original redeemScript (the one whose hash locked the UTXO) and the signatures necessary to unlock it, like this:
----
<Sig1> <Sig2> <2 PK1 PK2 PK3 PK4 PK5 5 OP_CHECKMULTISIG>
@ -479,7 +479,7 @@ The Pay-to-Script-Hash feature offers the following benefits compared to the dir
* Complex scripts are replaced by shorter fingerprint in the transaction output, making the transaction smaller
* Scripts can be coded as an address, so the sender and the sender's wallet don't need complex engineering to implement P2SH
* P2SH shifts the burden of constructing the script to the recipient not the sender
* P2SH shifts the burden in data storage for the long script from the output (which is in UTXO set) to the input (only stored on the blockchain)
* P2SH shifts the burden in data storage for the long script from the output (which is in the UTXO set and therefore impacts memory) to the input (only stored on the blockchain)
* P2SH shifts the burden in data storage for the long script from the present time (payment) to a future time (when it is spent)
* P2SH shifts the transaction fee cost of a long script from the sender to the recipient who has to include the long redeemScript to spend it
@ -487,7 +487,7 @@ The Pay-to-Script-Hash feature offers the following benefits compared to the dir
Pay-to-Script-Hash is currently limited to the standard types of bitcoin transaction scripts, by the +isStandard()+ function. That means that the redeemScript presented in the spending transaction must be one of the standard types: P2PK, P2PKH or Multi-Sig, excluding OP_RETURN and P2SH itself. You cannot reference a P2SH script inside a redeemScript and you can't use an OP_RETURN inside a P2SH redeemScript.
This limitation of redeemScript to only standard transaction scripts is temporary and will likely be removed in future versions of the bitcoin reference implementation, allowing the use of any valid script inside a P2SH redeemScript. You will still not be able to put a P2SH inside a P2SH redeemScript, because the P2SH specification is not recursive. You will still not be able to use OP_RETURN in a redeemScript because OP_RETURN cannot be redeemed by definition. But you will be able to use all the other operators to create a vast range of complex and novel scripts that can be used as redeemScripts and referenced as P2SH payment to their hash.
This limitation of redeemScript to only standard transaction scripts is temporary and will likely be removed in future versions of the bitcoin reference implementation, allowing the use of any valid script inside a P2SH redeemScript. You will still not be able to put a P2SH inside a P2SH redeemScript, because the P2SH specification is not recursive. You will still not be able to use OP_RETURN in a redeemScript because OP_RETURN cannot be redeemed by definition. But you will be able someday to use all the other operators to create a vast range of complex and novel scripts that can be used as redeemScripts and referenced as P2SH payment to their hash.
Note that since the redeemScript is not presented to the network until you attempt to spend a P2SH output, if you lock an output with the hash of a non-standard transaction it will be processed as valid. However, you will not be able to spend it as the spending transaction which includes the redeemScript will not be accepted, as it is non-standard. This creates a risk, as you can lock bitcoin in a P2SH which cannot be later spent. The network will accept the P2SH encumbrance even if it corresponds to a non-standard or invalid redeemScript, because the script hash gives no indication of the script it represents.
@ -512,7 +512,7 @@ P2SH locking scripts contain the hash of a redeemScript which gives no clues as
| OP_1NEGATE | 0x4f | Push the value "-1" onto the stack
| OP_RESERVED | 0x50 | Halt - Invalid transaction unless found in an unexecuted OP_IF clause
| OP_1 or OP_TRUE| 0x51 | Push the value "1" onto the stack
| OP_2 to OP_16 | 0x52 to 0x60 | For OP_N, push the value "N" onto the stack. eg. OP_2 pushes "2"
| OP_2 to OP_16 | 0x52 to 0x60 | For OP_N, push the value "N" onto the stack. E.g., OP_2 pushes "2"
|=======
[[tx_script_ops_table_control]]

@ -9,7 +9,7 @@ Bitcoin is structured as a peer-to-peer network architecture on top of the Inter
Bitcoin's P2P network architecture is much more than a topology choice. Bitcoin is a peer-to-peer digital cash system by design, and the network architecture is both a reflection and a foundation of that core characteristic. De-centralization of control is a core design principle and that can only be achieved and maintained by a flat, de-centralized P2P consensus network.
The term "bitcoin network" refers to the collection of nodes running the bitcoin P2P protocol. In addition to the bitcoin P2P protocol, there are other protocols such as Stratum, that are used for mining and lightweight or mobile wallets. These additional protocols are provided by gateway routing servers that access the bitcoin network using the bitcoin P2P protocol and then extend that network to nodes running other protocols. For example, Stratum servers connect Stratum mining nodes via the Stratum protocol to the main bitcoin network and bridge the Stratum protocol to the bitcoin P2P protocol. We use the term "extended bitcoin network" to refer to the overall network that includes the bitcoin P2P protocol, pool mining protocols, the Stratum protocol, and any other related protocols connecting the components of the bitcoin system.
The term "bitcoin network" refers to the collection of nodes running the bitcoin P2P protocol. In addition to the bitcoin P2P protocol, there are other protocols such as Stratum, which are used for mining and lightweight or mobile wallets. These additional protocols are provided by gateway routing servers that access the bitcoin network using the bitcoin P2P protocol and then extend that network to nodes running other protocols. For example, Stratum servers connect Stratum mining nodes via the Stratum protocol to the main bitcoin network and bridge the Stratum protocol to the bitcoin P2P protocol. We use the term "extended bitcoin network" to refer to the overall network that includes the bitcoin P2P protocol, pool mining protocols, the Stratum protocol, and any other related protocols connecting the components of the bitcoin system.
=== Nodes Types and Roles
@ -25,7 +25,7 @@ Some nodes, called full nodes, also maintain a complete and up-to-date copy of t
Mining nodes compete to create new blocks by running specialized hardware to solve the proof-of-work algorithm. Some mining nodes are also full nodes, maintaining a full copy of the blockchain while others are lightweight nodes participating in pool mining and depending on a pool server to maintain a full node. The mining function is shown in the full node above as a black circle named "Miner".
User wallets may be part of a full node, as is usually the case with desktop bitcoin clients. Increasingly many user wallets, especially those running on resource constrained devices such as smart phones, are SPV nodes. The wallet function is shown above as a green circle named "Wallet".
User wallets may be part of a full node, as is usually the case with desktop bitcoin clients. Increasingly many user wallets, especially those running on resource-constrained devices such as smart phones, are SPV nodes. The wallet function is shown above as a green circle named "Wallet".
In addition to the main node types on the bitcoin P2P protocol, there are servers and nodes running other protocols, such as specialized mining pool protocols and lightweight client access protocols.
@ -37,9 +37,9 @@ image::images/BitcoinNodeTypes.png["BitcoinNodeTypes"]
=== The Extended Bitcoin Network
The main bitcoin network, running the bitcoin P2P protocol consists of between 7,000 to 10,000 nodes running various versions of the bitcoin reference client (Bitcoin Core) and a few hundred nodes running various other implementations of the bitcoin P2P protocol, such as BitcoinJ, Libbitcoin, and btcd. A small percentage of the nodes on the bitcoin P2P network are also mining nodes, competing in the mining process, validating transactions, and creating new blocks. Various large companies interface with the bitcoin network by running full-node clients based on the Bitcoin Core client, with full copies of the blockchain and a network node, but without mining or wallet functions. These nodes act as network edge routers, allowing various other services (exchanges, wallets, block explorers, merchant payment processing) to be built on top.
The main bitcoin network, running the bitcoin P2P protocol, consists of between 7,000 to 10,000 nodes running various versions of the bitcoin reference client (Bitcoin Core) and a few hundred nodes running various other implementations of the bitcoin P2P protocol, such as BitcoinJ, Libbitcoin, and btcd. A small percentage of the nodes on the bitcoin P2P network are also mining nodes, competing in the mining process, validating transactions, and creating new blocks. Various large companies interface with the bitcoin network by running full-node clients based on the Bitcoin Core client, with full copies of the blockchain and a network node, but without mining or wallet functions. These nodes act as network edge routers, allowing various other services (exchanges, wallets, block explorers, merchant payment processing) to be built on top.
The extended bitcoin network includes the network running the bitcoin P2P protocol, described above, as well as nodes running specialized protocols. Attached to the main bitcoin P2P network are a number of pool servers and protocol gateways that connect nodes running other protocols, mostly pool mining nodes (see <<mining>>) and lightweight wallet clients, which do not carry a full copy of the blockchain.
The extended bitcoin network includes the network running the bitcoin P2P protocol, described above, as well as nodes running specialized protocols. Attached to the main bitcoin P2P network are a number of pool servers and protocol gateways that connect nodes running other protocols. These other protocol nodes are mostly pool mining nodes (see <<mining>>) and lightweight wallet clients, which do not carry a full copy of the blockchain.
The diagram below shows the extended bitcoin network with the various types of nodes, gateway servers, edge routers, and wallet clients and the various protocols they use to connect to each other.
@ -51,7 +51,7 @@ image::images/BitcoinNetwork.png["BitcoinNetwork"]
When a new node boots up, it must discover other bitcoin nodes on the network in order to participate. To start this process, a new node must discover at least one existing node on the network and connect to it. The geographic location of the other nodes is irrelevant, the bitcoin network topology is not geographically defined. Therefore, any existing bitcoin nodes can be selected at random.
To connect to a known peer, nodes establish a TCP connection, usually to port 8333 (the bitcoin "well known" port), or an alternative port if one is provided. Upon establishing a connection, the node will start a "handshake" by transmitting a +version+ message, which contains basic identifying information, including:
To connect to a known peer, nodes establish a TCP connection, usually to port 8333 (the bitcoin "well known" port), or an alternative port if one is provided. Upon establishing a connection, the node will start a "handshake" by transmitting a +version+ message, which contains basic identifying information, including:
* PROTOCOL_VERSION, a constant that defines the bitcoin P2P protocol version the client "speaks". E.g. 70002
* nLocalServices, a list of local services supported by the node, currently just NODE_NETWORK
@ -71,14 +71,14 @@ image::images/NetworkHandshake.png["NetworkHandshake"]
How does a new node find peers? While there are no special nodes in bitcoin, there are some long running stable nodes that are listed in the client as _seed nodes_. While a new node does not have to connect with the seed nodes, it can use them to quickly discover other nodes in the network. In the Bitcoin Core client, the option to use the seed nodes is controlled by the option switch +-dnsseed+, which is set to 1, to use the seed nodes, by default. Alternatively, a bootstrapping node that knows nothing of the network must be given the IP address of at least one bitcoin node after which it can establish connections through further introductions. The command line argument +-seednode+ can be used to connect to one node just for introductions, using it as a DNS seed. After the initial seed node is used to form introductions, the client will disconnect from it and use the newly discovered peers.
Once one or more connections are established, the new node will send an +addr+ message containing its own IP address, to its neighbors. The neighbors will in turn forward the +addr+ message to their neighbors, ensuring that the newly connected node becomes well known and better connected. Additionally, the newly connected node can send +getaddr+ to the neighbors asking them to return a list of IP addresses of other peers. That way, a node can find peers to connect to and advertise its existence on the network for other nodes to find it.
Once one or more connections are established, the new node will send an +addr+ message containing its own IP address, to its neighbors. The neighbors will in turn forward the +addr+ message to their neighbors, ensuring that the newly connected node becomes well known and better connected. Additionally, the newly connected node can send +getaddr+ to the neighbors, asking them to return a list of IP addresses of other peers. That way, a node can find peers to connect to and advertise its existence on the network for other nodes to find it.
[[address_propagation]]
.Address Propagation and Discovery
image::images/AddressPropagation.png["AddressPropagation"]
A node must connect to a few different peers in order to establish diverse paths into the bitcoin network. These paths are not reliable, nodes come and go, and so the node must continue to discover new nodes as it loses old connections as well as assist other nodes when they bootstrap. Only one connection is needed to bootstrap, as the first node can offer introductions to its peer nodes and those peers can offer further introductions. Its also unnecessary and wasteful of network resources to connect to more than a handful of nodes. After bootstrapping, a node will remember its most recent successful peer connections, so that if it is rebooted it can quickly reestablish connections with its former peer network. If none of the former peers respond to its connection request, the node can use the seed nodes to bootstrap again.
A node must connect to a few different peers in order to establish diverse paths into the bitcoin network. Paths are not reliable, nodes come and go, and so the node must continue to discover new nodes as it loses old connections as well as assist other nodes when they bootstrap. Only one connection is needed to bootstrap, as the first node can offer introductions to its peer nodes and those peers can offer further introductions. It's also unnecessary and wasteful of network resources to connect to more than a handful of nodes. After bootstrapping, a node will remember its most recent successful peer connections, so that if it is rebooted it can quickly reestablish connections with its former peer network. If none of the former peers respond to its connection request, the node can use the seed nodes to bootstrap again.
On a node running the Bitcoin Core client, you can list the peer connections with the command +getpeerinfo+:
----
@ -121,13 +121,13 @@ $ bitcoin-cli getpeerinfo
To override the automatic management of peers and to specify a list of IP addresses, users can provide the option +-connect=<IPAddress>+ and specify one or more IP addresses. If this option is used, the node will only connect to the selected IP addresses, instead of discovering and maintaining the peer connections automatically.
If there is no traffic on a connection, nodes will periodically send a message to maintain the connection. If a node has not communicated on a connection for more than 90 minutes, it is assumed to be disconnected and a new peer will be sought. Thus the network dynamically adjusts to transient nodes, network problems, and can organically grow and shrink as needed without any central control.
If there is no traffic on a connection, nodes will periodically send a message to maintain the connection. If a node has not communicated on a connection for more than 90 minutes, it is assumed to be disconnected and a new peer will be sought. Thus, the network dynamically adjusts to transient nodes, network problems, and can organically grow and shrink as needed without any central control.
=== Full Nodes
Full nodes are nodes that maintain a full blockchain, with all transactions. More accurately they probably should be called "full blockchain nodes". In the early years of bitcoin, all nodes were full nodes and currently the Bitcoin Core client is a full blockchain node. In the last two years however, new forms of bitcoin clients have been introduced, which do not maintain a full blockchain but run as lightweight clients. These are examined in more detail in the next section.
Full nodes are nodes that maintain a full blockchain, with all transactions. More accurately they probably should be called "full blockchain nodes". In the early years of bitcoin, all nodes were full nodes and currently the Bitcoin Core client is a full blockchain node. In the last two years, however, new forms of bitcoin clients have been introduced that do not maintain a full blockchain but run as lightweight clients. These are examined in more detail in the next section.
Full blockchain nodes maintain a complete and up-to-date copy of the bitcoin blockchain, with all the transactions, which they independently build and verify, starting with the very first block (genesis block) and up to the latest known block in the network. A full blockchain node can independently and authoritatively verify any transaction, without recourse or reliance on any other node or source of information. The full blockchain node relies on the network to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
Full blockchain nodes maintain a complete and up-to-date copy of the bitcoin blockchain, with all the transactions, which they independently build and verify, starting with the very first block (genesis block) and building up to the latest known block in the network. A full blockchain node can independently and authoritatively verify any transaction, without recourse or reliance on any other node or source of information. The full blockchain node relies on the network to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
Running a full blockchain node gives you the pure bitcoin experience: independent verification of all transactions without the need to rely on, or trust, any other systems. It's easy to tell if you're running a full node because it requires several gigabytes of persistent storage (disk space) to store the full blockchain. If you need a lot of disk and it takes 2-3 days to "sync" to the network, you are running a full node. That is the price of complete independence and freedom from central authority.
@ -139,7 +139,7 @@ The first thing a full node will do once it connects to peers is try to construc
The process of "syncing" the blockchain starts with the +version+ message, as that contains +BestHeight+, a node's current blockchain height (number of blocks). A node will see the +version+ messages from its peers, know how many blocks they each have and be able to compare to how many blocks it has in its own blockchain. Peered nodes will exchange a +getblocks+ message that contains the hash (fingerprint) of the top block on their local blockchain. One of the peers will be able to identify the received hash as belonging to a block that is not at the top, but rather belongs to an older block, thus deducing that its own local blockchain is longer than its peer's.
The peer that has the longer blockchain, has more blocks than the other node and can identify which blocks the other node needs to "catch up". It will identify the first 500 blocks to share and transmit their hashes using an +inv+ (inventory) message. The node missing these blocks will then retrieve them, by issuing a series of +getdata+ messages requesting the full block data and identifying the requested blocks using the hashes from the +inv+ message.
The peer that has the longer blockchain has more blocks than the other node and can identify which blocks the other node needs in order to "catch up". It will identify the first 500 blocks to share and transmit their hashes using an +inv+ (inventory) message. The node missing these blocks will then retrieve them, by issuing a series of +getdata+ messages requesting the full block data and identifying the requested blocks using the hashes from the +inv+ message.
Let's assume for example that a node only has the genesis block. It will then receive an +inv+ message from its peers containing the hashes of the next 500 blocks in the chain. It will start requesting blocks from all of its connected peers, spreading the load and ensuring that it doesn't overwhelm any peer with requests. The node keeps track of how many blocks are "in transit" per peer connection, meaning blocks that it has requested but not received, checking that it does not exceed a limit (MAX_BLOCKS_IN_TRANSIT_PER_PEER). This way, if it needs a lot of blocks, it will only request new ones as previous requests are fulfilled, allowing the peers to control the pace of updates and not overwhelming the network. As each block is received, it is added to the blockchain as we will see in the next chapter <<blockchain>>. The local blockchain is gradually built up, more blocks are requested and received and the process continues until the node catches up to the rest of the network.
@ -159,7 +159,7 @@ As an analogy, a full node is like a tourist in a strange city, equipped with a
Simple Payment Verification verifies transactions by reference to their _depth_ in the blockchain instead of their _height_. Whereas a full-blockchain node will construct a fully verified chain of thousands of blocks and transactions reaching down the blockchain (back in time) all the way to the genesis block, an SPV node will verify the chain of all blocks and link that chain to the transaction of interest.
For example, when examining a transaction in block 300,000, a full node links all 300,000 blocks down to the genesis block and builds a full database of UTXO, establishing the validity of the transaction by confirming that the UTXO remains unspent. An SPV node cannot validate whether the UTXO is unspent. Instead, the SPV node will establish a link between the transaction and the block that contains it, using a Merkle Path (see <<merkle_trees>>). Then, the SPV node waits until is sees the six blocks 300,001 through 300,006 piled on top of the block containing the transaction and verifies it by establishing its depth under blocks 300,006 to 300,001. The fact that other nodes on the network accepted block 300,000 and then did the necessary work to produce 6 more blocks on top of it is proof, by proxy, that the transaction was not a double-spend.
For example, when examining a transaction in block 300,000, a full node links all 300,000 blocks down to the genesis block and builds a full database of UTXO, establishing the validity of the transaction by confirming that the UTXO remains unspent. An SPV node cannot validate whether the UTXO is unspent. Instead, the SPV node will establish a link between the transaction and the block that contains it, using a Merkle Path (see <<merkle_trees>>). Then, the SPV node waits until it sees the six blocks 300,001 through 300,006 piled on top of the block containing the transaction and verifies it by establishing its depth under blocks 300,006 to 300,001. The fact that other nodes on the network accepted block 300,000 and then did the necessary work to produce 6 more blocks on top of it is proof, by proxy, that the transaction was not a double-spend.
An SPV node cannot be persuaded that a transaction exists in a block, when it does not in fact exist. The SPV node establishes the existence of a transaction in a block by requesting a merkle path proof and by validating the proof-of-work in the chain of blocks. However, a transaction's existence can be "hidden" from an SPV node. An SPV node can definitely prove that a transaction exists but cannot verify that a transaction, such as a double-spend of the same UTXO, doesn't exist because it doesn't have a record of all transactions. This type of attack can be used as a Denial-of-Service attack or as a double-spending attack against SPV nodes. To defend against this, an SPV node needs to connect randomly to several nodes, to increase the probability that it is in contact with at least one honest node. SPV nodes are therefore vulnerable to network partitioning attacks or Sybil attacks, where they are connected to fake nodes or fake networks and do not have access to honest nodes or the real bitcoin network.
@ -206,7 +206,7 @@ Here's an example of adding a pattern "A" to the simple bloom filter shown above
.Adding a pattern "A" to our simple bloom filter
image::images/Bloom2.png["Bloom2"]
Adding a second pattern is as simple as repeating this process. The pattern is hashed by each hash function in turn and the result is recorded by setting the bits to +1+. Note that as a bloom filter is filled with more patterns, a hash function result may coincide with a bit that is already set to +1+ in which case the bit is not changed. In essence, as more patterns record on overlapping bits, the bloom filter starts to become saturated with more bits set to +1+ and the accuracy of the filter decreases. This is why the filter is a probabilistic data structure - it gets less accurate as more patterns are added. The accuracy depends on the number of patterns added versus the size of the bit array (N) and number of hash functions (M). A larger bit array and more hash functions can record more patterns with higher accuracy. A smaller bit array or fewer hash functions will record fewer patterns and produce less accuracy.
Adding a second pattern is as simple as repeating this process. The pattern is hashed by each hash function in turn and the result is recorded by setting the bits to +1+. Note that as a bloom filter is filled with more patterns, a hash function result may coincide with a bit that is already set to +1+ in which case the bit is not changed. In essence, as more patterns record on overlapping bits, the bloom filter starts to become saturated with more bits set to +1+ and the accuracy of the filter decreases. This is why the filter is a probabilistic data structure -- it gets less accurate as more patterns are added. The accuracy depends on the number of patterns added versus the size of the bit array (N) and number of hash functions (M). A larger bit array and more hash functions can record more patterns with higher accuracy. A smaller bit array or fewer hash functions will record fewer patterns and produce less accuracy.
Below is an example of adding a second pattern "B" to the simple bloom filter:
@ -214,7 +214,7 @@ Below is an example of adding a second pattern "B" to the simple bloom filter:
.Adding a second pattern "B" to our simple bloom filter
image::images/Bloom3.png["Bloom3"]
To test if a pattern is part of a bloom filter, the pattern is hashed by each hash function and the resulting bit pattern is tested against the bit array. If all the bits indexed by the hash functions are set to +1+, then the patten is _probably_ recorded in the bloom filter. Since the bits may be set because of overlap from multiple patterns, the answer is not certain, but is rather probabilistic. In simple terms, a bloom filter positive match is a "Maybe, Yes".
To test if a pattern is part of a bloom filter, the pattern is hashed by each hash function and the resulting bit pattern is tested against the bit array. If all the bits indexed by the hash functions are set to +1+, then the pattern is _probably_ recorded in the bloom filter. Since the bits may be set because of overlap from multiple patterns, the answer is not certain, but is rather probabilistic. In simple terms, a bloom filter positive match is a "Maybe, Yes".
Below is an example of testing the existence of pattern "X" in the simple bloom filter. The corresponding bits are set to +1+, so the pattern is probably a match:
@ -230,8 +230,8 @@ Below is an example of testing the existence of pattern "Y" in the simple bloom
.Testing the existence of pattern "Y" in the bloom filter. The result is a definitive negative match, meaning "Definitely No"
image::images/Bloom5.png["Bloom5"]
Bitcoin's implementation of bloom filters is described in Bitcoin Improvement Proposal 37 (BIP0037). See <<bip0037>> or visit:
https://github.com/bitcoin/bips/blob/master/bip-0037.mediawiki
Bitcoin's implementation of bloom filters is described in Bitcoin Improvement Proposal 37 (BIP0037). See <<bip0037>> or visit:
https://github.com/bitcoin/bips/blob/master/bip-0037.mediawiki.
=== Bloom Filters and Inventory Updates
@ -254,9 +254,9 @@ When a transaction is added to the transaction pool, the orphan pool is checked
Both the transaction pool and orphan pool (where implemented) are stored in local memory and are not saved on persistent storage, rather they are dynamically populated from incoming network messages. When a node starts, both pools are empty and are gradually populated with new transactions received on the network.
Some implementations of the bitcoin client also maintain a UTXO database or UTXO pool which is the set of all unspent outputs on the blockchain. While the name "UTXO pool" sounds similar to the transaction pool, it represents a different set of data. Unlike the transaction and orphan pools, the UTXO pool is not initialized empty but instead contains millions of entries of unspent transaction outputs including some dating back to 2009. The UTXO pool may be housed in local memory or as an indexed database table on persistent storage.
Some implementations of the bitcoin client also maintain a UTXO database or UTXO pool, which is the set of all unspent outputs on the blockchain. While the name "UTXO pool" sounds similar to the transaction pool, it represents a different set of data. Unlike the transaction and orphan pools, the UTXO pool is not initialized empty but instead contains millions of entries of unspent transaction outputs including some dating back to 2009. The UTXO pool may be housed in local memory or as an indexed database table on persistent storage.
Whereas the transaction and orphan pools represent a single node's local perspective and may vary significantly from node to node depending upon when the node was started or restarted, the UTXO pool represents the emergent consensus of the network and therefore will vary little between nodes. Furthermore the transaction and orphan pools only contain unconfirmed transactions, while the UTXO pool only contains confirmed outputs.
Whereas the transaction and orphan pools represent a single node's local perspective and may vary significantly from node to node depending upon when the node was started or restarted, the UTXO pool represents the emergent consensus of the network and therefore will vary little between nodes. Furthermore, the transaction and orphan pools only contain unconfirmed transactions, while the UTXO pool only contains confirmed outputs.
=== Alert Messages

Loading…
Cancel
Save