ch06 edits 2

pull/217/head
Andreas M. Antonopoulos 8 years ago
parent 81e79e3fdb
commit 1192c97c8e

@ -325,7 +325,7 @@ The API returns a JSON object with the current fee estimate for fastest confirma
[[tx_fee_equation]]
.Transaction fees are implied, as the excess of inputs minus outputs:
----
Fees = Sum(Inputs) Sum(Outputs)
Fees = Sum(Inputs) -- Sum(Outputs)
----
This is a somewhat confusing element of transactions and an important point to understand, because if you are constructing your own transactions you must ensure you do not inadvertently include a very large fee by underspending the inputs. That means that you must account for all inputs, if necessary by creating change, or you will end up giving the miners a very big tip!
@ -343,16 +343,18 @@ Now let's look at a different scenario. Eugenia, our children's charity director
As Eugenia's wallet application tries to construct a single larger payment transaction, it must source from the available UTXO set, which is composed of many smaller amounts. That means that the resulting transaction will source from more than a hundred small-value UTXO as inputs and only one output, paying the book publisher. A transaction with that many inputs will be larger than one kilobyte, perhaps a kilobyte or several kilobytes in size. As a result, it will require a much higher fee than the median sized transaction.
Eugenia's wallet application will calculate the appropriate fee by measuring the size of the transaction and multiplying that by the per-kilobyte fee. Many wallets will overpay fees for larger transactions to ensure the transaction is processed promptly. The higher fee is not because Eugenia is spending more money, but because her transaction is more complex and larger in sizethe fee is independent of the transaction's bitcoin value.(((range="endofrange", startref="ix_ch06-asciidoc8")))(((range="endofrange", startref="ix_ch06-asciidoc7")))
Eugenia's wallet application will calculate the appropriate fee by measuring the size of the transaction and multiplying that by the per-kilobyte fee. Many wallets will overpay fees for larger transactions to ensure the transaction is processed promptly. The higher fee is not because Eugenia is spending more money, but because her transaction is more complex and larger in size--the fee is independent of the transaction's bitcoin value.(((range="endofrange", startref="ix_ch06-asciidoc8")))(((range="endofrange", startref="ix_ch06-asciidoc7")))
[[tx_script]]
=== Transaction Scripts and Script Language
((("scripts", id="ix_ch06-asciidoc9", range="startofrange")))((("transactions","script language for", id="ix_ch06-asciidoc10", range="startofrange")))((("transactions","validation", id="ix_ch06-asciidoc11", range="startofrange")))((("validation (transaction)", id="ix_ch06-asciidoc12", range="startofrange")))Bitcoin clients validate transactions by executing a script, written in a Forth-like scripting language, often referred to simply as _Script_. Both the locking script placed on a UTXO and the unlocking script are written in this scripting language. When a transaction is validated, the unlocking script in each input is executed alongside the corresponding locking script to see if it satisfies the spending condition.
((("scripts", id="ix_ch06-asciidoc9", range="startofrange")))((("transactions","script language for", id="ix_ch06-asciidoc10", range="startofrange")))((("transactions","validation", id="ix_ch06-asciidoc11", range="startofrange")))((("validation (transaction)", id="ix_ch06-asciidoc12", range="startofrange")))((("Script language", id="ix_ch06-asciidoc13", range="startofrange")))((("scripts","language for", id="ix_ch06-asciidoc14", range="startofrange")))The bitcoin transaction script language, called _Script_, is a Forth-like reverse-polish notation stack-based execution language. If that sounds like gibberish, you probably haven't studied 1960's programming languages, but that's ok - we will explain it all in this chapter. Both the locking script placed on a UTXO and the unlocking script are written in this scripting language. When a transaction is validated, the unlocking script in each input is executed alongside the corresponding locking script to see if it satisfies the spending condition.
Today, most transactions processed through the bitcoin network have the form "Payment to Bob's bitcoin address" and are based on a script called a Pay-to-Public-Key-Hash script. However, the use of scripts to lock outputs and unlock inputs means that through use of the programming language, transactions can contain an infinite number of conditions. Bitcoin transactions are not limited to the "Payment to Bob's bitcoin address" script.
Script is a very simple language that was designed to be limited in scope and executable on a range of hardware, perhaps as simple as an embedded device. It requires minimal processing and cannot do many of the fancy things modern programming languages can do. For its use in validating programmable money, this is a deliberate security feature.
In this section, we will demonstrate the components of the bitcoin transaction scripting language and show how it can be used to express complex conditions for spending and how those conditions can be satisfied by unlocking scripts.
Today, most transactions processed through the bitcoin network have the form "Payment to Bob's bitcoin address" and are based on a script called a Pay-to-Public-Key-Hash script. However, bitcoin transactions are not limited to the "Payment to Bob's bitcoin address" script. In fact, locking scripts can be written to express a vast variety of complex conditions. In order to understand these more complex scripts, we must first understand the basics of transaction scripts and script language.
In this section, we will demonstrate the basic components of the bitcoin transaction scripting language and show how it can be used to express simple conditions for spending and how those conditions can be satisfied by unlocking scripts.
[TIP]
====
@ -362,25 +364,24 @@ Bitcoin transaction validation is not based on a static pattern, but instead is
==== Turing Incompleteness
((("Script language","flow-control/loops in")))((("Script language","statelessness of")))((("Turing Complete")))The bitcoin transaction script language contains many operators, but is deliberately limited in one important waythere are no loops or complex flow control capabilities other than conditional flow control. This ensures that the language is not _Turing Complete_, meaning that scripts have limited complexity and predictable execution times. Script is not a general-purpose language. These limitations ensure that the language cannot be used to create an infinite loop or other form of "logic bomb" that could be embedded in a transaction in a way that causes a((("denial-of-service attack","Script language and"))) denial-of-service attack against the bitcoin network. Remember, every transaction is validated by every full node on the bitcoin network. A limited language prevents the transaction validation mechanism from being used as a vulnerability.
((("Script language","flow-control/loops in")))((("Script language","statelessness of")))((("Turing Complete")))The bitcoin transaction script language contains many operators, but is deliberately limited in one important way--there are no loops or complex flow control capabilities other than conditional flow control. This ensures that the language is not _Turing Complete_, meaning that scripts have limited complexity and predictable execution times. Script is not a general-purpose language. These limitations ensure that the language cannot be used to create an infinite loop or other form of "logic bomb" that could be embedded in a transaction in a way that causes a((("denial-of-service attack","Script language and"))) denial-of-service attack against the bitcoin network. Remember, every transaction is validated by every full node on the bitcoin network. A limited language prevents the transaction validation mechanism from being used as a vulnerability.
==== Stateless Verification
((("stateless verification of transactions")))((("transactions","statelessness of")))The bitcoin transaction script language is stateless, in that there is no state prior to execution of the script, or state saved after execution of the script. Therefore, all the information needed to execute a script is contained within the script. A script will predictably execute the same way on any system. If your system verifies a script, you can be sure that every other system in the bitcoin network will also verify the script, meaning that a valid transaction is valid for everyone and everyone knows this. This predictability of outcomes is an essential benefit of the bitcoin system.(((range="endofrange", startref="ix_ch06-asciidoc12")))(((range="endofrange", startref="ix_ch06-asciidoc11")))(((range="endofrange", startref="ix_ch06-asciidoc10")))(((range="endofrange", startref="ix_ch06-asciidoc9")))
==== Script Construction (Lock + Unlock)
((("scripts","construction of")))((("validation (transaction)","script construction for")))Bitcoin's transaction validation engine relies on two types of scripts to validate transactions: a locking script and an unlocking script.
((("locking scripts","transaction validation and")))((("validation (transaction)","locking scripts")))A locking script is an spending condition placed on an output: it specifies the conditions that must be met to spend the output in the future. Historically, the locking script was called a _scriptPubKey_, because it usually contained a public key or bitcoin address (public key hash). In this book we refer to it as a "locking script" to acknowledge the much broader range of possibilities of this scripting technology. In most bitcoin applications, what we refer to as a locking script will appear in the source code as +scriptPubKey+. You will also see the locking script referred to as a _witness script_ (see <<segwit>>) or more generally as a _cryptographic puzzle_. These terms all mean the same thing, at different levels of abstraction.
((("locking scripts","transaction validation and")))((("validation (transaction)","locking scripts")))A locking script is a spending condition placed on an output: it specifies the conditions that must be met to spend the output in the future. Historically, the locking script was called a _scriptPubKey_, because it usually contained a public key or bitcoin address (public key hash). In this book we refer to it as a "locking script" to acknowledge the much broader range of possibilities of this scripting technology. In most bitcoin applications, what we refer to as a locking script will appear in the source code as +scriptPubKey+. You will also see the locking script referred to as a _witness script_ (see <<segwit>>) or more generally as a _cryptographic puzzle_. These terms all mean the same thing, at different levels of abstraction.
((("unlocking scripts","transaction validation and")))An unlocking script is a script that "solves," or satisfies, the conditions placed on an output by a locking script and allows the output to be spent. Unlocking scripts are part of every transaction input. Most of the time they contain a digital signature produced by the user's wallet from his or her private key. Historically, the unlocking script is called _scriptSig_, because it usually contained a digital signature. In most bitcoin applications, the source code refers to the unlocking script as +scriptSig+. You will also see the unlocking script referred to as a _witness_ (see <<segwit>>). In this book, we refer to it as an "unlocking script" to acknowledge the much broader range of locking script requirements, because not all unlocking scripts must contain signatures.
Every bitcoin validating node will validate transactions by executing the locking and unlocking scripts together. For each input in the transaction, the validation software will first retrieve the UTXO referenced by the input. That UTXO contains a locking script defining the conditions required to spend it. The validation software will then take the unlocking script contained in the input that is attempting to spend this UTXO and execute the two scripts.
Every bitcoin validating node will validate transactions by executing the locking and unlocking scripts together. Each input contains an unlocking script and refers to a previously existing UTXO. The validation software will copy the unlocking script, retrieve the UTXO referenced by the input and copy the locking script from that UTXO. The unlocking and locking script are then executed in sequence. The input is valid if the unlocking script satisfies the locking script conditions (see <<script_exec>>). All the inputs are validated independently, as part of the overall validation of the transaction.
Note that the UTXO is permanently recorded in the blockchain, and therefore is invariable and is unaffected by failed attempts to spend it by reference in a new transaction. Only a valid transaction that correctly satisfies the conditions of the output results in the output being considered as "spent" and removed from the set of unspent transaction outputs (UTXO set).
Note that the UTXO is permanently recorded in the blockchain, and therefore is invariable and is unaffected by failed attempts to spend it by reference in a new transaction. Only a valid transaction that correctly satisfies the conditions of the output results in the output being considered as "spent" and removed from the set of unspent transaction outputs (UTXO set)((("UTXO Set", "removing outputs")))((("UTXO", "spending"))).
<<scriptSig_and_scriptPubKey>> is an example of the unlocking and locking scripts for the most common type of bitcoin transaction (a payment to a public key hash), showing the combined script resulting from the concatenation of the unlocking and locking scripts prior to script validation.
@ -388,11 +389,6 @@ Note that the UTXO is permanently recorded in the blockchain, and therefore is i
.Combining scriptSig and scriptPubKey to evaluate a transaction script
image::images/msbt_0501.png["scriptSig_and_scriptPubKey"]
[[tx_script_language]]
==== Scripting Language
((("Script language", id="ix_ch06-asciidoc13", range="startofrange")))((("scripts","language for", id="ix_ch06-asciidoc14", range="startofrange")))The bitcoin transaction script language, called Script, is a Forth-like reverse-polish notation stack-based execution language. If that sounds like gibberish, you probably haven't studied 1960's programming languages. Script is a very simple language that was designed to be limited in scope and executable on a range of hardware, perhaps as simple as an embedded device, such as a handheld calculator. It requires minimal processing and cannot do many of the fancy things modern programming languages can do. In the case of programmable money, that is a deliberate security feature.
===== The Script Execution Stack
Bitcoin's scripting language is called a stack-based language because it uses a data structure called a((("stack, defined"))) _stack_. A stack is a very simple data structure, which can be visualized as a stack of cards. A stack allows two operations: push and pop. Push adds an item on top of the stack. Pop removes the top item from the stack. Operations on a stack can only act on the top-most item on the stack. A stack data structure is also called a Last-In-First-Out, or "LIFO" queue.
@ -403,14 +399,9 @@ Conditional operators evaluate a condition, producing a boolean result of TRUE o
===== A Simple Script
In <<simplemath_script>>, the script +2 3 OP_ADD 5 OP_EQUAL+ demonstrates the arithmetic addition operator +OP_ADD+, adding two numbers and putting the result on the stack, followed by the conditional operator +OP_EQUAL+, which checks that the resulting sum is equal to +5+. For brevity, the +OP_+ prefix is omitted in the step-by-step example.
The following is a slightly more complex script, which calculates ++2 + 7 3 + 1++. Notice that when the script contains several operators in a row, the stack allows the results of one operator to be acted upon by the next operator:
Now let's apply what we've learned about scripts and stacks to some simple examples.
----
2 7 OP_ADD 3 OP_SUB 1 OP_ADD 7 OP_EQUAL
----
Try validating the preceding script yourself using pencil and paper. When the script execution ends, you should be left with the value TRUE on the stack.
In <<simplemath_script>>, the script +2 3 OP_ADD 5 OP_EQUAL+ demonstrates the arithmetic addition operator +OP_ADD+, adding two numbers and putting the result on the stack, followed by the conditional operator +OP_EQUAL+, which checks that the resulting sum is equal to +5+. For brevity, the +OP_+ prefix is omitted in the step-by-step example. For more details on the available script operators and functions, see <<tx_script_ops>>.
Although most locking scripts refer to a public key hash (essentially, a bitcoin address), thereby requiring proof of ownership to spend the funds, the script does not have to be that complex. Any combination of locking and unlocking scripts that results in a TRUE value is valid. The simple arithmetic we used as an example of the scripting language is also a valid locking script that can be used to lock a transaction output.
@ -442,6 +433,14 @@ image::images/msbt_0502.png["TxScriptSimpleMathExample"]
Transactions are valid if the top result on the stack is TRUE (noted as ++&#x7b;0x01&#x7d;++), any other non-zero value or if the stack is empty after script execution. Transactions are invalid if the top value on the stack is FALSE (a zero-length empty value, noted as ++&#x7b;&#x7d;++) or if script execution is halted explicitly by an operator, such as OP_VERIFY, OP_RETURN, or a conditional terminator such as OP_ENDIF. See <<tx_script_ops>> for details.
====
The following is a slightly more complex script, which calculates ++2 + 7 -- 3 + 1++. Notice that when the script contains several operators in a row, the stack allows the results of one operator to be acted upon by the next operator:
----
2 7 OP_ADD 3 OP_SUB 1 OP_ADD 7 OP_EQUAL
----
Try validating the preceding script yourself using pencil and paper. When the script execution ends, you should be left with the value TRUE on the stack.
[[script_exec]]
===== Separate Execution of Unlocking and Locking Scripts
In the original bitcoin client, the unlocking and locking scripts were concatenated and executed in sequence. For security reasons, this was changed in 2010, because of a vulnerability that allowed a malformed unlocking script to push data onto the stack an corrupt the locking script. In the current implementation, the scripts are executed separately with the stack transferred between the two executions, as described next.
@ -492,11 +491,11 @@ image::images/msbt_0504.png["Tx_Script_P2PubKeyHash_2"]
So far, we have not delved into any detail about "digital signatures". In this section we look at how digital signatures work and how they can present proof of ownership of a private key without revealing that private key.
The digital signature algorithm used in bitcoin is the _Elliptic Curve Digital Signature Algorithm_, or _ECDSA_. ECDSA is the algorithm used for digital signatures based on elliptic curve private/public key pairs, as described in <<ecc>>.
The digital signature algorithm used in bitcoin is the _Elliptic Curve Digital Signature Algorithm_, or _ECDSA_. ECDSA is the algorithm used for digital signatures based on elliptic curve private/public key pairs, as described in <<ecc>>. ECDSA is used by the script functions OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY. Any time you see those in a locking script, the unlocking script must contain an ECDSA signature.
A digital signature serves three purposes in bitcoin (see <<digital_signature_definition>>). First, the signature proves that the owner of the private key, who is by implication the owner of the funds, has *authorized* the spending of those funds. Secondly, the proof of authorization is *undeniable* (non-repudiation). Thirdly, the signature proves that the transaction (or specific parts of the transaction) have not and *can not be modified* by anyone other than the owner of the private key.
A digital signature serves three purposes in bitcoin (see <<digital_signature_definition>>). First, the signature proves that the owner of the private key, who is by implication the owner of the funds, has *authorized* the spending of those funds. Secondly, the proof of authorization is *undeniable* (non-repudiation). Thirdly, the signature proves that the transaction (or specific parts of the transaction) have not and *can not be modified* by anyone after it has been been signed.
Note that each transaction input is signed independently. This is critical, as neither the signatures, nor the inputs have to belong or be applied by the same "owners". In fact, a specific transaction scheme called "CoinJoin" uses this fact to create multi-party transactions for privacy.
Note that each transaction input is signed independently. This is critical, as neither the signatures, nor the inputs have to belong to or be applied by the same "owners". In fact, a specific transaction scheme called "CoinJoin" uses this fact to create multi-party transactions for privacy.
[NOTE]
====
@ -569,13 +568,13 @@ See if you can decode Alice's serialized (DER-encoded) signature using the guide
To verify the signature, one must have the signature (+R+ and +S+), the serialized transaction and the public key (that corresponds to the private key used to create the signature). Essentially, verification of a signature means "Only the owner of the private key that generated this public key could have produced this signature on this transaction".
The signature verification algorithm takes the message (transaction or parts of it), the signer's public key and the signature (+R+ and +S+ values) and returns TRUE if the signature is valid for this transaction and public key.
The signature verification algorithm takes the message (a hash of the transaction or parts of it), the signer's public key and the signature (+R+ and +S+ values) and returns TRUE if the signature is valid for this message and public key.
==== Signature Hash Types (SIGHASH)
Digital signatures are applied to messages, which in the case of bitcoin, are the transactions themselves. The signature implies a _commitment_ by the signer to specific transaction data. In the simplest form, the signature applies to the entire transaction, thereby committing all the inputs, outputs and other transaction fields. But, a signature can commit to only a subset of the data in a transaction, which is useful for a number of scenarios as we will see below.
Bitcoin signatures have a way of indicating which part of a transaction's data is included in the data signed by the transaction, through the use of a SIGHASH flag. The SIGHASH flag is a single byte that is appended to the signature.
Bitcoin signatures have a way of indicating which part of a transaction's data is included in the hash signed by the private key, through the use of a SIGHASH flag. The SIGHASH flag is a single byte that is appended to the signature. Every signature has a SIGHASH flag and the flag can be different from to input to input. A transaction with three signed inputs may have three signatures with different SIGHASH flags, each signature signing (committing) different parts of the transaction.
Remember, each input may contain a signature in its unlocking script. As a result, a transaction that contains several inputs may have signatures with different SIGHASH flags that commit different parts of the transaction in each of the inputs. Note also that bitcoin transactions may contain inputs from different "owners", who may sign only one input in a partially constructed (and invalid) transaction, collaborating with others to gather all the necessary signatures to make a valid transaction. Many of the SIGHASH flag types only make sense if you think of multiple participants collaborating outside the bitcoin network and updating a partially signed transaction.
@ -592,7 +591,7 @@ In addition, there is a modifier flag SIGHASH_ANYONECANPAY, which can be combine
|=======================
|SIGHASH flag| Value | Description
| SIGHASH_ALL\|ANYONECANPAY | 0x81 | Signature applies to one inputs and all outputs
| ALL\|ANYONECANPAY | 0x81 | Signature applies to one inputs and all outputs
| NONE\|ANYONECANPAY | 0x82 | Signature applies to one inputs, none of the outputs
| SINGLE\|ANYONECANPAY | 0x83 | Signature applies to one input & the output with the same index number
|=======================
@ -601,21 +600,26 @@ The way SIGHASH flags are applied during signing and verification, is that a cop
[NOTE]
====
All SIGHASH types sign the transaction nLocktime field. In addition, the SIGHASH type itself is appended to the transaction before it is signed, so that it can't be modified one signed.
All SIGHASH types sign the transaction nLocktime field (see <<locktime>>). In addition, the SIGHASH type itself is appended to the transaction before it is signed, so that it can't be modified once signed.
====
In the example of Alice's transaction (see <<decoded_alice_sig>>), we saw that the last part of the DER-encoded signature was +01+, which is the SIGHASH_ALL flag. This locks the transaction data,so Alice's signature is committing the state of all inputs and outputs. This is the most common signature form.
In the example of Alice's transaction (see <<decoded_alice_sig>>), we saw that the last part of the DER-encoded signature was +01+, which is the SIGHASH_ALL flag. This locks the transaction data, so Alice's signature is committing the state of all inputs and outputs. This is the most common signature form.
Let's look at some of the other SIGHASH types and how they can be used in practice:
ALL|ANYONECANPAY :: This construction can be used to make a "crowdfunding"-style transaction. Someone attempting to raise funds can construct a transaction with a single output. The single output pays the "goal" amount to the fundraiser. Such a transaction, is obviously not valid, as it has no inputs. However, others can now amend it by adding an input of their own, as a donation. They sign their own input with ALL|ANYONECANPAY. Unless enough inputs are gathered to reach the value of the output, the transaction is invalid. Each donation is a "pledge", which cannot be collected by the fundraiser until the entire goal amount is raised.
NONE :: This construction can be used to create a "bearer-check" of a specific amount. It commits to the input, but allows the output locking script to be changed. Anyone can write their own bitcoin address into the output locking script and redeem the transaction. However, the output value itself is locked by the signature.
NONE :: This construction can be used to create a "bearer-check" or "blank check" of a specific amount. It commits to the input, but allows the output locking script to be changed. Anyone can write their own bitcoin address into the output locking script and redeem the transaction. However, the output value itself is locked by the signature.
NONE|ANYONECANPAY :: This constuction can be used to build a "dust collector". User's who have tiny UTXO in their wallets can't spend these without the cost in fees exceeding the value of the dust. With this type of signature, the dust UTXO can be donated, for anyone to aggregate and spend whenever they want.
NONE|ANYONECANPAY :: This construction can be used to build a "dust collector". User's who have tiny UTXO in their wallets can't spend these without the cost in fees exceeding the value of the dust. With this type of signature, the dust UTXO can be donated, for anyone to aggregate and spend whenever they want.
There are some proposals to modify or expand the SIGHASH system. One such proposal is _Bitmask Sighash Modes_ by Blockstream's Glenn Willen, as part of the Elements project. This aims to create a flexible replacement for SIGHASH types that allows "arbitrary, miner-rewritable bitmasks of inputs and outputs" that can express "more complex contractual pre-commitment schemes, such as signed offers with change in a distributed asset exchange."
[NOTE]
====
You will not see SIGHASH flags presented as an option in a user's wallet application. With few exceptions, wallets construct P2PKH scripts and sign with SIGHASH_ALL flags. To use a different SIGHASH flag, you would have to write software to construct and sign transactions. More importantly, SIGHASH flags can be used by special purpose bitcoin applications that enable novel uses.
====
[[ecdsa_math]]
==== ECDSA Math
@ -664,10 +668,10 @@ As we saw in <<ecdsa_math>>, the signature generation algorithm uses a random ke
[WARNING]
====
If the same value +k+ is used in the signing algorithm on two different transactions, the private key can be calculated and is exposed!
If the same value +k+ is used in the signing algorithm on two different transactions, the private key can be calculated and exposed to the world!
====
This is not just a theoretical possibility. We have seen this issue lead to exposure of private keys in a few different implementations of transaction signing algorithms in bitcoin. People have lost funds to attackes because of inadvertent re-use of a +k+ value. The most common reason for re-use of a +k+ value is an improperly initialized random-number generator.
This is not just a theoretical possibility. We have seen this issue lead to exposure of private keys in a few different implementations of transaction signing algorithms in bitcoin. People have had funds stolen because of inadvertent re-use of a +k+ value. The most common reason for re-use of a +k+ value is an improperly initialized random-number generator.
To avoid this vulnerability, the industry best practice is to not generate +k+ with a random-number generator seeded with entropy, but instead to use a deterministic-random process seeded with the transaction data itself. That ensures that each transaction produces a different +k+. The industry-standard algorithm for deterministic initialization of +k+ is defined in https://tools.ietf.org/html/rfc6979[RFC 6979] published by the Internet Engineering Task Force.

Loading…
Cancel
Save