Available under a Creative Commons CC-BY-NC-ND license and readable at Mastering Ethereum
Ethereum is:
- unbounded (Turing complete)
- deterministic
- a state machine
- a globally accessible singleton
Smart contracts are computer programs that provide:
- high availability
- auditability
- transparency
- neutrality
- reduced counterparty risk
- censorship resistance
- P2P network that propagates transactions and blocks
- Messages that represent state transitions
- Consensus rules governing validity of state transitions
- A state machine processing the transactions
- Chain of cryptographically secure blocks
- Consensus algorithm with decentralized control
- Game theoretic incentive scheme to maintain the ledger
- Multiple open source client implementations
- Ethereum is general purpose: it tracks the changes in code and data over time
- Ethereum runs a virtual machine to execute code, called the Ethereum Virtual Machine (EVM)
- The state data for Ethereum is stored in a serialized hashed data structure called a Merkle Patricia Tree
- Ethereum uses a metered resource called gas to avoid infinite loops
- Each instruction has a cost in gas
- Transactions specify a maximum gas limit; if the EVM reaches it, execution halts
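Gas metering can be sketched with a toy interpreter. This is an illustration only: the instruction names and costs in `GAS_COST` are hypothetical and do not match the real EVM fee schedule.

```python
class OutOfGas(Exception):
    """Raised when the next instruction costs more gas than is left."""


# Hypothetical per-instruction costs; the real EVM schedule differs.
GAS_COST = {"PUSH": 3, "ADD": 3, "JUMP": 8}


def run(program, gas_limit):
    """Run a list of instruction names and return the gas consumed."""
    gas_left = gas_limit
    pc = 0
    steps = 0
    while pc < len(program):
        op = program[pc]
        cost = GAS_COST[op]
        if cost > gas_left:
            raise OutOfGas(f"halted after {steps} steps")
        gas_left -= cost
        steps += 1
        # In this toy machine JUMP always returns to the first instruction,
        # so any program containing it would loop forever without metering.
        pc = 0 if op == "JUMP" else pc + 1
    return gas_limit - gas_left
```

A program such as `["PUSH", "JUMP"]` never terminates on its own, but `run` stops it once the supplied gas is used up.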
- Tokens can represent:
- Currency
- Resource Claims
- Equity
- Voting
- Access
- Collectibles
- Identity
- Attestation (Signatures)
- Utility (Payment)
- Can be fungible (non-unique) or non-fungible (unique)
- Some tokens are native to the blockchain and therefore have the lowest counterparty risk
- e.g. purely blockchain-based tokens, as opposed to tokens which bridge the chain to the real world
- One critical role of tokens is converting extrinsic assets into intrinsic (native) assets
- ETH transactions are handled at the protocol level whereas Tokens are processed at the contract level
- Utility Tokens: Use of token is required to access a resource
- Equity Tokens: Represents a control or ownership of something
- A fungible token standard
- Is a common interface for tokens
- (functions): `totalSupply`, `balanceOf`, `transfer`, `transferFrom`, `approve`, `allowance`
- (events): `Transfer`, `Approval`
- A token contract is essentially two internal mappings:
- Balances: `address => uint`
- Allowances: `address => (address => uint)` // used for delegation
- Developers should be careful when sending ERC20 tokens to contracts: if the receiving contract has no ERC20 functionality, the tokens are lost forever (approx. $2.5 million USD lost this way so far)
- Nuances:
- Unlike ETH, the recipient of tokens is not the transaction's destination address; the transaction is sent to the token contract
- The token contract manages all the mappings and state of accounts
- Tokens are sent via `TokenContract.transfer`, not `send`
- Sending tokens requires ETH (to pay for gas)
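The two-mapping model can be sketched in plain Python. This is a simplified illustration, not the ERC20 reference code: the `Token` class and its method names are hypothetical, and the caller is passed explicitly where a contract would read `msg.sender`.

```python
class Token:
    """Toy fungible token: a balances map plus a nested allowances map."""

    def __init__(self, owner, supply):
        self.total_supply = supply
        self.balances = {owner: supply}  # address => uint
        self.allowances = {}             # address => (address => uint)

    def balance_of(self, who):
        return self.balances.get(who, 0)

    def transfer(self, sender, to, amount):
        if amount < 0 or self.balance_of(sender) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balance_of(sender) - amount
        self.balances[to] = self.balance_of(to) + amount

    def approve(self, owner, spender, amount):
        # The owner delegates up to `amount` tokens to `spender`.
        self.allowances.setdefault(owner, {})[spender] = amount

    def allowance(self, owner, spender):
        return self.allowances.get(owner, {}).get(spender, 0)

    def transfer_from(self, spender, owner, to, amount):
        if self.allowance(owner, spender) < amount:
            raise ValueError("allowance exceeded")
        self.transfer(owner, to, amount)
        self.allowances[owner][spender] -= amount
```

Note that no tokens ever move between accounts; only the numbers stored inside the token contract change.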
- Another standard to prevent sending tokens to ERC20 incompatible contracts
- A check to determine whether the destination address is a contract or not
- e.g. does it have a `tokenFallback` function
- Support for an extended ERC20 interface, with support for ERC820 token registries
- Includes
- various "hooks" for receiving notifications
- proxy support
- Metadata
- "Deed" contracts representing ownership of a unique thing
- Mapping has an "owner" rather than a balance
- Optional: `MetaDataInterface` (indexable interface)
- Standards exist to promote interoperability
- Additional extensions include:
- Burning
- Minting
- White / Black listing
- Ownership
- Crowd funding
- Caps
- Recovery
- Oracles provide an external data source to smart contracts
- Oracles are needed because EVM execution must be deterministic
- There is no randomness internally
- External data can only be included in the `data` payload
- Oracles "act as a bridge" between the real world and the on-chain world
- Data uses include:
- Randomness
- Exchange Rates
- Political Events
- Geolocations
- Flights Stats
- Functionally:
- Collecting data from an offchain source
- Transferring data on chain
- Making data available by placing in storage for contracts to access
- Immediate Request: a one time query for a value (e.g. is over 21?)
- Pub-Sub: An RSS feed type that pulls data from offchain
- Request Response:
- Data set is too large to be stored on chain
- Can work by sending a request to the oracle and observing the result as an EVM state change
There are a few ways to prove authenticity of an oracle's data
- Authenticity Proofs: Cryptographic guarantees that data has not been tampered with
- Oraclize's TLSNotary Proof (proof of HTTPS communication between client and server)
- Oracles that perform computation that is infeasible on chain
- e.g. Oraclize deploys a user-configured Docker container to AWS which performs the computation and writes values to `stdout` to return to the requesting dapp
- Centralized but auditable
- Cryptlets are a standard
- Web3 encompasses and attempts to decentralize: messaging, storage and naming
- Core benefits of DApps
- Resiliency
- Transparency
- Censorship Resistance
- Business Logic = Smart contracts
- Messaging = Whisper
- Storage = IPFS/Swarm
- Domain Name System
- Operates on nodes created by the namehash algorithm
- EIP137
- Only node owners can set their names and subdomains
- Nodes are traversed by recursively hashing and resolving the higher domain:
`keccak(keccak(0x00 + keccak("eth")) + keccak("example"))`
- Currently the root TLD (which is required to resolve anything) is owned by a 4-of-7 multisig
- Resolver contracts are user-controlled and settable by domain owners to resolve metadata and information about a domain
- Names are distributed through a Vickrey auction (sealed bids)
- Winner pays the second highest bid
- In ENS:
- Users lock funds
- Must submit two transactions (submit bid + commit/reveal)
- Reveal bid or risk fund loss
- Domain names lock an amount of ETH as a commitment for one year
- Resolver contract addresses are returned after ENS hashes the node (if it exists)
- There is a default resolver if none is supplied, which can point to wallets or Swarm addresses
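The recursive hashing used by namehash can be sketched as below. Important assumption: Python's standard library has no Keccak-256, so the hash is passed in as a parameter and `stand_in_hash` uses SHA3-256 purely as a placeholder. SHA3-256 pads differently from Keccak-256, so the digests here will NOT match real ENS nodes.

```python
import hashlib


def namehash(name, hash_fn):
    """Hash a dotted name label by label, right to left.

    The root (empty name) is 32 zero bytes; each step computes
    hash_fn(parent_node + hash_fn(label)).
    """
    node = bytes(32)
    if name:
        for label in reversed(name.split(".")):
            node = hash_fn(node + hash_fn(label.encode("utf-8")))
    return node


def stand_in_hash(data):
    # ASSUMPTION: placeholder only. ENS uses Keccak-256, not SHA3-256.
    return hashlib.sha3_256(data).digest()
```

Because each node depends only on its parent node and its own label, the owner of `eth` can create `example.eth` without anyone needing to know the full name in advance.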
- EVM is a "quasi" Turing complete state machine
- Stack Based
- 256 bit words
- ROM for code storage
- Has no scheduling capacity (single threaded)
- Role of EVM is to update state by valid state transitions
- Ethereum World State is a mapping of 160bit address to accounts
- Accounts represent:
- Balance (ETH)
- Nonce
- (Contracts) Storage
- (Contracts) Code
- When a contract is executed a "sandboxed" EVM is instantiated and performs the computation
- In the event of a reversion, the EVM is destructed and no state change occurs to the global state
- If the code completes, the global state machine copies the state changes from the sandboxed EVM
- When creating a new contract, the code in the `data` field is used to instantiate an EVM instance with the data loaded into program ROM
- When a transaction is sent to an ABI compatible contract
- Dispatcher is called to read the data field
- The first 4 bytes are extracted
- The stack machine then evaluates each function signature (4 bytes of keccak hash) searching for the prototype
- jumps to that execution if it exists, otherwise exits with `STOP`
- Ethereum fees are calculated based on computation
- Gas PRICE: the exchange rate, in ETH (wei) per unit of gas
- Gas COST: the total number of gas units consumed by the operation
- Deleting stored values is encouraged and therefore gas is refunded for doing so
- The block gas limit is the maximum amount of gas that can be consumed in a single block
- ~ 8 million or 380 send transactions (21k gas)
- Miners can vote to increase or decrease the gas limit based on network demands
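The fee arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the 8 million block gas limit is the figure quoted in these notes, not a fixed protocol constant.

```python
WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def tx_fee_wei(gas_used, gas_price_gwei):
    """Fee paid = gas units consumed * price per unit (converted to wei)."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI


def simple_transfers_per_block(block_gas_limit, gas_per_transfer=21_000):
    """How many plain 21k-gas payments fit under the block gas limit."""
    return block_gas_limit // gas_per_transfer
```

For example, a plain payment at a gas price of 20 gwei costs `21_000 * 20 * 10**9` wei, which is 0.00042 ETH.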
- Proof of Work (POW): ETHash
- Hashimoto-Dagger algorithm
- Generates a larger DAG structure to make hashing memory hard (ASIC resistance)
- Proof of Stake (POS): Casper
- Small deposit of ETH is made to become a validator and propose blocks
- Lose stake if block is rejected
- Earn rewards proportional to stake if block accepted
- Ethereum has a native currency called ether, ETH, Ξ (Greek Xi)
- The smallest unit is wei (10<sup>-18</sup> ETH), which is used everywhere internally
- gwei = gigawei = 10<sup>9</sup> wei
- Ethereum is account based rather than UTXO based (Bitcoin)
- Two types of accounts in Ethereum:
- Externally Owned Accounts: Does not have contract code, has a private key
- Contract Accounts: Contain executable code, does not have a private key
- Contracts are owned by their logic (e.g. they can set whoever is the owner)
- They have no private key so they cannot initiate state transitions, but can react to them
- When a transaction is sent to a contract account, the EVM is run using the transaction's *data* payload.
- transaction objects specify function names in their data field, which is used to run the contract
- msg object is one of the inputs accessible by a smart contract
- it is part of the transaction that triggers execution of the contract
- fallback function
- this function is executed when a transaction is made without specifying the function to be called
- it acts as the "fallback" and is used to accept ether deposits (as long as it's marked payable)
- Contract creation is registered by creating a transaction whose destination address is the zero address (0x0).
- When a contract sends some of its balance to an EoA it is considered an "internal transaction" message (you can see the difference on Etherscan)
- Ethereum clients are interoperable so long as they follow the same reference specification (e.g. Yellow Paper)
- Remote Clients: MyCrypto, MetaMask communicate with existing networks
- Light Clients: Simplified Payment Verification (SPV), validate block headers and Merkle proofs to verify a transaction is included in a block
- Fullnodes: Download and validate the entire blockchain
- The fullnode is set to prune in default mode and is therefore ~80-100GB
- A fully archival node is 1TB
- Clients include: Parity (Rust), Geth (Go)
- Due to a DoS attack at block 2283397, fullnode syncing will stall until block 2700000
- Fast syncing is available via `--fast`
- HTTP service offered on port `8545`
- Client libraries essentially stub and construct RPC calls in JSON format under the hood
- RPC calls include
- `jsonrpc`: version (usually 2.0)
- `method`: method to call
- `params`: arguments to pass along
- `id`: used to correlate calls (usually for batching calls)
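As a sketch of what a client library builds under the hood, the snippet below constructs a JSON-RPC request body without sending it. The helper name `rpc_request` is hypothetical and the zero address is only a placeholder argument; `eth_getTransactionCount` is a standard Ethereum JSON-RPC method.

```python
import json


def rpc_request(method, params, request_id=1):
    """Serialize one JSON-RPC 2.0 call as a JSON string."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


# Ask for the transaction count (nonce) of an account at the latest block.
payload = rpc_request(
    "eth_getTransactionCount",
    ["0x0000000000000000000000000000000000000000", "latest"],
)
```

The resulting string would be sent as the body of an HTTP POST to the node's RPC port; the response carries the same `id` so that calls can be matched up.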
- Ethereum uses digital signatures to authorize the movements of funds - the "crypto" in cryptocurrencies
- Fundamental property of cryptography: easy to go one way, hard to compute the inverse
- This enables digital signatures and secrets
- Trapdoor functions: crypto functions that are hard to invert unless one has a secret piece of information
- Discrete logarithm problem: a difficult problem that must be solved to undo elliptic curve multiplication in Elliptic Curve Cryptography (ECC)
- This results in a useful property: elliptic curve multiplication is easy but the reverse is hard
- The search space of Ethereum private keys is 1 to 2<sup>256</sup> (256 bits or 64 hexadecimal digits)
- To create a `privKey` we feed a large random number (entropy) into the `keccak256` hashing function
- Aside: `keccak256` preceded `sha3`, but NIST did not finish standardizing `sha3` during Ethereum's development, which is why `keccak256` is used
- Creating a public key:
K (pubKey) = k (privKey) * G (generator point constant)
Note: * denotes elliptic curve multiplication, not arithmetic multiplication; it is only one way
The reverse action is to find the discrete logarithm
- Ethereum uses the `secp256k1` curve (generator point `G`)
- Note that `G` is constant in Ethereum, therefore the same `privKey` always generates the same `pubKey`
- ECC Addition:
- ECC Multiplication:
- Arithmetically: d x P = P + P + ... + P (P added to itself d times)
- The public key (`K`) is a point on the curve, e.g. K = (01234567,09876542); the private key (`k`) is a number
- Ethereum only uses uncompressed keys (0x04 prefix)
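Point addition and multiplication on `secp256k1` can be sketched in pure Python. This is a teaching sketch only: it is not constant-time and must not be used to handle real keys. The function names are hypothetical; the curve constants are the published `secp256k1` parameters.

```python
# secp256k1: y^2 = x^3 + 7 over the field of integers modulo P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) = infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, P - 2, P) % P   # tangent
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P  # chord
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point by double-and-add (k additions would be too slow)."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_key(priv_key):
    """K = k * G."""
    return scalar_mult(priv_key, G)
```

Going from `k` to `K` takes a few hundred point operations, while recovering `k` from `K` requires solving the discrete logarithm problem.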
- Addresses are unique identifiers derived from public keys using the `keccak256` hashing function and then taking the last 20 bytes
`pubKey: (x,y) -> concatenated xy -> keccak256 -> last 20 bytes`
- By default addresses are not checksummed
- EIP55 introduces checksums
- Addresses are compatible with the ICAP protocol (which is compatible with IBAN, the international bank account number format)
- Wallets are a basic data structure to manage a user's keys
- ! Wallets only hold keys not tokens
- Just a bunch of keys (JBOK) in a store
- geth generates a `keystore.json` file: a JSON payload of the (encrypted) private key
- uses a password stretching function ("KDF", key derivation function)
- repeatedly hashes to stretch, thereby preventing brute force / rainbow table attacks
- (e.g. need to hash `n` times for each password attempt)
- Keys are related from a common seed.
- Use a key derivation algorithm to derive, the most common being BIP32/44
- HD Wallet (BIP32)
- A tree data structure where parents derive child keys using a key derivation algorithm
- Allows an organized structure for keys
- Allows child public keys to be generated from parent public keys (creating watch-only addresses from xpub keys)
- BIP39: Mnemonics
- Mnemonic words are used to generate a seed
- A word represents a random number used to construct a seed (entropy is split into 11-bit groups, each mapped to a word)
Ex. (not accurate)
00000000000 army
00000000001 abandon
- Entropy of mnemonic (128-256 bits) is stretched to 512 bits via PBKDF2 using arguments `mnemonic` and `salt`
- Seed is completed after 2048 rounds of hashing with HMAC-SHA512 to prevent brute forcing
- the `salt` is an optional passphrase which, if incorrect (different from the original), leads to a different set of keys
- Good for second-factor security
- An "under duress" wallet
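The stretching step can be sketched with the standard library. This follows the BIP39 description (PBKDF2 with HMAC-SHA512, 2048 rounds, salt = the string "mnemonic" plus the optional passphrase); the function name is hypothetical, and the sketch does not validate the word list or checksum.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic, passphrase=""):
    """Stretch a mnemonic sentence into a 512-bit (64-byte) seed."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)
```

Every passphrase yields a valid seed, so a wrong passphrase does not produce an error; it silently opens a different wallet, which is what makes the "under duress" wallet possible.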
- BIP43
- A standard to include a "purpose" as the first hardened child key in derivation paths
- Allowing organization and specific types of HD wallet trees
- e.g. `m/43'` is a different tree structure from `m/44'`
- BIP44
- Extension of BIP43
`m / purpose' / coin' / account' / is_change / address_index`
- `m` = master private key, `M` = master public key
- Ethereum is `m/44'/60'/0'/0*/index` (* Ethereum is always receiving)
- The structure lets us create extended public keys too, allowing downstream derivation of child keys
Parent Pub -
\
Child Priv --> Child public
- Great for Cold Storage
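A derivation path is just a list of child indexes, where an apostrophe marks a hardened child (index offset by 2<sup>31</sup>, as defined in BIP32). A minimal parsing sketch, with a hypothetical helper name:

```python
HARDENED = 0x80000000  # 2**31, the BIP32 offset for hardened children


def parse_path(path):
    """Turn "m/44'/60'/0'/0/5" into a list of BIP32 child indexes."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start with 'm'")
    indexes = []
    for part in parts[1:]:
        hardened = part.endswith("'")
        number = int(part.rstrip("'"))
        indexes.append(number + HARDENED if hardened else number)
    return indexes
```

Only the non-hardened levels (the last two in a BIP44 path) can be derived from an extended public key alone.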
- Transactions are signed messages originating from an Externally Owned Account (EoA)
- They are the only way to mutate state of Ethereum
- Structure
- Nonce: A sequence number by the original EoA (for replay protection)
- Gas Price: The price in (ETH) willing to pay for gas
- Gas Limit: The maximum gas to buy for this transaction
- Recipient: The destination of the transaction
- Value: The value contained
- Data: A bytecode payload
- v, r, s: ECDSA signature components, from which the signer's public key can be recovered
- Transactions are serialized with the Recursive Length Prefix (RLP) format, which has no delimiters or labels.
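RLP is small enough to sketch in full. The encoder below follows the length-prefix rules of the RLP specification for byte strings, non-negative integers and nested lists; the function names are hypothetical and decoding is omitted.

```python
def int_to_bytes(value):
    """Big-endian bytes with no leading zeros; 0 becomes the empty string."""
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


def encode_length(length, offset):
    """Short payloads get a one-byte prefix, long ones a length-of-length."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = int_to_bytes(length)
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """Encode an int, a bytes object, or a (nested) list of those."""
    if isinstance(item, int):
        item = int_to_bytes(item)
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    payload = b"".join(rlp_encode(element) for element in item)
    return encode_length(len(payload), 0xC0) + payload
```

Since fields carry no labels, a transaction is encoded simply as a list of its values in a fixed order.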
Nonce is a dynamically calculated value: the number of confirmed transactions from the originating EoA
- Allows proper transaction ordering → a transaction whose nonce is too high is stuck in the mempool and ignored until the gap is filled
- Prevents replay attacks on transactions
- You can fetch the transaction count via `getTransactionCount`
- Beware of nonce gaps:
- e.g. two transactions, Tx (nonce 0) & Tx (nonce 2).
- Tx (nonce 2) is stuck in the mempool until the node receives the preceding Tx (nonce 1).
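The effect of a nonce gap can be sketched as follows. This is a simplified model of one sender's pending transactions, with a hypothetical function name; real mempools also consider gas price and balance.

```python
def executable(pending_nonces, confirmed_count):
    """Return the pending nonces that can be mined now, in order.

    Execution must continue from the sender's confirmed transaction count
    without skipping any nonce; everything after a gap stays queued.
    """
    pending = set(pending_nonces)
    ready = []
    next_nonce = confirmed_count
    while next_nonce in pending:
        ready.append(next_nonce)
        next_nonce += 1
    return ready
```

With pending nonces 0 and 2, only nonce 0 is executable; nonce 2 is released as soon as nonce 1 arrives.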
Gas is a separate virtual currency (NOT ether) in which computation is priced; fees are paid to miners
- Gas has its own exchange rate against ether
- It allows separation from the volatility of ether value
- Gas Limit: is the gas unit order size
- Gas Price: Wei per gas unit
- Simple payment transactions require 21,000 gas
- Contract executions are variable and not preset
- A good analogy:
- A credit account for the gas station with payment only sent on completion
There are three basic types of transactions:
- Value only (Payment)
- Data only (Invocation)
- Value & Data (Payment and Invocation)
- `value` to EoA: updates the state of the destination address with the value sent
- `value` to Contract:
- Calls function in `data` payload
- Fallback function if no function specified in `data` payload
- Increase the contract balance if no fallback function is defined
- `data` to EoA: usually not interpreted
- `data` to Contract: calls function specified in `data` payload
- `data` transactions are subject to consensus rules
- The `data` payload of a transaction is a hex serialization of:
- Function selector → first 4 bytes of the `keccak256` hash of the function prototype [e.g. `withdraw(uint256)`]
- Arguments → hex encoded to the Application Binary Interface (ABI) spec, e.g.
|_ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _|_ _ ... _ _|
4 bytes (8 hex chars). 32 bytes variable
1. Padding 2.
- Contract creation is a special kind of transaction with destination address `0x0` (zero address)
- The transaction contains the compiled bytecode as `data`
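The selector and argument layout can be reproduced in pure Python. Since the standard library has no Keccak-256, the sketch includes a compact, unoptimized implementation of the permutation; `selector` and `encode_call` are hypothetical helper names, and only `uint256` arguments are handled.

```python
def _rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & 0xFFFFFFFFFFFFFFFF


def _keccak_f1600(lanes):
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constants generated by an 8-bit LFSR
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def _permute(state):
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = _keccak_f1600(lanes)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data):
    """Original Keccak-256 (padding byte 0x01), as used by Ethereum."""
    rate = 136
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = _permute(state)
    return bytes(state[:32])


def selector(prototype):
    """First 4 bytes of the hash of the canonical function prototype."""
    return keccak256(prototype.encode("ascii"))[:4]


def encode_call(prototype, uint_args):
    """Selector followed by each uint256 argument left-padded to 32 bytes."""
    return selector(prototype) + b"".join(a.to_bytes(32, "big") for a in uint_args)
```

For instance, `encode_call("withdraw(uint256)", [10**17]).hex()` gives the 36-byte payload a wallet would place in the `data` field to withdraw 0.1 ether from a contract exposing that function.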
Digital signatures perform three functions:
- proves ownership of the private key (and as a result the authorization to spend or execute functions)
- Proof of authorization is undeniable
- Proof that data has not / cannot be modified
- The signature in Ethereum is:
- a signature over the `keccak256` hash of the RLP encoded `data` for the transaction
Sig = F<sub>sig</sub>(F<sub>keccak256</sub>(m), k) = (r, s)
where
- F<sub>sig</sub> is the signature algorithm
- F<sub>keccak256</sub> is the keccak256 hashing function
- `m` is the message to hash (transaction)
- `k` is the private key
- `r` is the resulting `x` coordinate of the ephemeral public key created during the signature process
To verify signatures one needs:
- The public key that "signed" the message
- The serialized transaction to verify
- The signature parameters `(r,s)`
Algorithm::create_ethereum_signed_tx
1. Construct a transaction, tx, with:
data
gasLimit
gasPrice
nonce
to
value
chainID
0,0
2. Serialize the transaction in RLP format:
rlp_encoded_tx = rlp_encode(tx)
3. Take the keccak256 hash of the payload in (2):
hash = keccak256(rlp_encoded_tx)
4. Sign the hash with an EoA private key
sig = sign(hash, privKey)
5. Append the v, r, and s values to tx
(The tx is now signed)
- EIP155: Replay Protection
- The parameters `chainID`, `0`, `0` included in the original unsigned `tx` prevent `chainID` tampering and attest to the network on which the transaction was intended to be broadcast
- Public Key Recovery (`v`) Parameter
- When we recover a signer's public key, the curve is symmetric, so we can compute two possible keys `R` and `R'`
- To avoid this duplicate work, the signing algorithm returns `v` where:
- if `v` is odd, use `R'`
- else use `R`
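Under EIP155 the chain identifier is folded into `v` together with the one-bit recovery identifier: `v = chainID * 2 + 35 + recovery_id`. A small sketch of packing and unpacking it (function names are hypothetical; pre-EIP155 signatures use 27 or 28):

```python
def eip155_v(chain_id, recovery_id):
    """Combine the chain ID and the 0/1 recovery identifier into v."""
    if recovery_id not in (0, 1):
        raise ValueError("recovery_id must be 0 or 1")
    return chain_id * 2 + 35 + recovery_id


def split_v(v):
    """Return (chain_id, recovery_id); chain_id is None for legacy 27/28."""
    if v in (27, 28):
        return None, v - 27
    return (v - 35) // 2, (v - 35) % 2
```

On mainnet (chain ID 1) this yields 37 or 38, so the parity of `v` still tells the verifier which of the two candidate points to use.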
- A *smart contract* was originally defined by Nick Szabo to be
A set of promises specified in digital form in which the parties perform on each other's promises
- In Ethereum:
Immutable deterministic computer programs that run in the limited context of the Ethereum Virtual Machine (EVM) on a world computer (e.g. propagated to a global state)
- Unlike EoAs, contracts do not have private keys and can only run when invoked by a transaction.
- Contracts cannot run in parallel since Ethereum is a single threaded state machine
- Transactions are atomic with state changes recorded only if execution is successful (e.g. there is no error)
- Errors trigger state rollbacks reverting all state changes in value except for gas fees
- Contracts cannot be modified but can be deleted via the op code `SELFDESTRUCT`
- The original program must have this functionality programmed in, however.
- The EVM runs EVM Bytecode
- Coding in Ethereum should prefer functional (declarative) programming over imperative (procedural)
- Solidity is procedural and the de facto language today
- Application Binary Interface: defines the contract for how data structures and functions are accessed by machine code
- Defines the functions in a contract that can be invoked
- Compiler directive: `pragma solidity ^0.4.24` tells what the acceptable compiler is
- `^0.4.24` indicates that any version from `0.4.24` up to, but not including, `0.5.0` is acceptable (e.g. `0.5` is disallowed)
- `msg.sender`: Originating caller
- `msg.data`: data payload
- `msg.value`: Ether sent in wei
- `msg.sig`: first 4 bytes of data
- `gasleft()` (formerly `msg.gas`): gas remaining
- `tx.gasprice`: Gas price for the transaction
- `tx.origin`: Originating EoA
- `block.blockhash(blockNumber)`: Hash of the given block
- `block.coinbase`: Coinbase address for fees and rewards
- `block.difficulty`: Current PoW difficulty of the block
- `block.gaslimit`: Max gas for all transactions to fit in block
- `block.number`: Current block height
- `block.timestamp`: Unix epoch timestamp in seconds
- `address.balance`: Gets balance of address in wei
- `address.transfer`: Attempts to transfer value passed, throws error if it fails
- `address.call`: Low level CALL, arbitrary msg with `data` payload
- `address.delegatecall`: Like `callcode` but preserves the `msg` context (libraries)
- `address.send`: Attempts to send value passed in, returns false if it fails
- `address.callcode`: Low level CALLCODE, runs the code at the target address in this contract's context
- `public`: Callable from any EoA or contract
- `private`: Not callable from derived contracts
- `external`: Callable from outside; from within the contract only with explicit `this`
- `internal`: Only callable from within the contract and derived contracts (protected)
- `constant`/`view`: Promises not to write (modify) state but can read
- `payable`: Function can accept incoming payments
- `pure`: Function cannot read or write to state (declarative)
- Created via the `constructor` keyword
- Destruction with `SELFDESTRUCT`
- Modifiers place constraints on the function execution
- They substitute code before execution of a function body:
modifier onlyOwner {
require(msg.sender == owner);
_; // This placeholder is replaced with the body of the modified function
}
- Errors revert all state changes
- `require` is used as a gateway condition
- `assert` / `throw` is used to halt execution and revert state changes
- `transfer` will automatically throw if not enough ETH is present
- A transaction receipt is given when a tx completes
- these contain `logs` which can be watched
- Can create contracts via the `new` operator, which returns an object
- Specify value of the contract on creation:
`faucet = (new Faucet).value(0.5 ether)()`
- You can cast an address as a contract
- `faucet = Faucet(_f)`
- Can directly call methods via `call`
- Uses direct opcodes of EVM
- `_faucet.call("withdraw", 0.1 ether)`
- Risk of re-entrancy attack
- `delegatecall` keeps the `msg` context constant so that `msg.sender` is the same (used for libraries)
- `call` in comparison modifies `msg.sender` (e.g. different execution context)
- Vyper takes one step closer to declarative programming than Solidity
- Types of Problematic Contracts Vyper tries to address
- Suicidal Contracts: Contracts can at times be arbitrarily destroyed
- Greedy Contracts: Contracts that can get into "unreleasable" or "unusable" states
- Prodigal Contracts: Contracts that can release funds to arbitrary addresses
- Vyper removes modifiers, as they change the context of code execution
- Uses inline confirmation and assertion checks instead
- Also forces state modifications to be explicit in the contracts
- Vyper has no inheritance
- Vyper has no inline assembly
- Vyper allows explicit type casting via `convert()`, which calls through a conversion table, but no implicit casting
- Vyper outputs LLL (Low Level Lisp Language) code to be compiled into bytecode for the EVM
- Defensive programming
- Use well tested code libraries
- Prefer minimalistic and simple implementations
- There is a high quality bar due to immutability
- Write in a readable and auditable way
- Focus on extensive test coverage
- Sending Ether to an unknown address.
- a `payable` fallback function in a malicious contract can trigger a vulnerable execution path, causing it to "re-enter" the calling contract
- Use the Checks-Effects-Interactions pattern
- Use mutexes to lock state
- Prefer the `transfer` function over `send`
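Re-entrancy and the Checks-Effects-Interactions fix can be simulated outside the EVM. In this sketch the `Bank` class, the `on_receive` callback (standing in for the attacker's fallback function) and the account names are all hypothetical.

```python
class Bank:
    """Toy contract holding deposits; `safe` selects the withdrawal order."""

    def __init__(self, safe):
        self.safe = safe
        self.balances = {}
        self.vault = 0  # total ether held by the contract

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:  # checks
            return
        if self.safe:
            self.balances[who] = 0              # effects before interaction
        self.vault -= amount
        on_receive(amount)                      # interaction: external call
        if not self.safe:
            self.balances[who] = 0              # effect applied too late


def attack(bank):
    """Re-enter withdraw from the receive callback; return the total taken."""
    taken = []

    def on_receive(amount):
        taken.append(amount)
        bank.withdraw("attacker", on_receive)

    bank.withdraw("attacker", on_receive)
    return sum(taken)
```

With a victim deposit of 90 and an attacker deposit of 10, the vulnerable ordering lets the attacker drain all 100, while the safe ordering pays out only the attacker's own 10.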
- a fixed-size variable is used to store data, but the value is outside its range, causing wraparound for numbers and unexpected behaviour
- Underflow: Value is under the storage range
uint8 my_var = 0
my_var - 1 // = 255
- Overflow: Value is over the storage range
- Using `SafeMath`
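Python integers never overflow, so fixed-width behaviour has to be emulated with a modulus. The sketch below contrasts wrapping arithmetic with checked arithmetic in the spirit of `SafeMath`; the function names are hypothetical.

```python
def wrapping_sub(a, b, bits=8):
    """Subtract as an unsigned fixed-width integer would (wraps around)."""
    return (a - b) % (2**bits)


def wrapping_add(a, b, bits=8):
    """Add as an unsigned fixed-width integer would (wraps around)."""
    return (a + b) % (2**bits)


def safe_sub(a, b):
    """Checked subtraction: fail instead of wrapping below zero."""
    if b > a:
        raise ArithmeticError("underflow")
    return a - b


def safe_add(a, b, bits=8):
    """Checked addition: fail instead of wrapping past the maximum."""
    result = a + b
    if result >= 2**bits:
        raise ArithmeticError("overflow")
    return result
```

For a full-width `uint256` the same wraparound applies with `bits=256`, so `0 - 1` becomes `2**256 - 1`.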
- `SELFDESTRUCT` opcode forcibly sends ether to a contract (e.g. malicious contract destruction sends ether to target)
- Contract relies on "invariant" values like `this.balance`
- Use a self-defined variable to track balance rather than relying on `this.balance`
- Context-preserving `delegatecall` is misused, particularly to exploit how Ethereum manages storage with slots.
- Contracts store data in `slot`s, e.g. a library's first variable is `slot[0]`
- Malicious contract sets its address as a receiving address or library address for a critical point of business logic.
- Because `delegatecall` preserves context, slot values can be overwritten
- Using the `library` keyword explicitly for contracts
- Keeping libraries stateless and non-self-destructing
- Performing floating point arithmetic with integers, causing issues like underflows, overflows, or lack of precision
- Allow large numerators in fractions
- Be mindful of order of operations
- Convert to higher precision, do the math, then convert down
- Visibility is not specified, resulting in default `public` and accessible functions
- Explicitly setting visibility modifiers in code
- Contracts require some sort of randomness to function and mistakenly use pseudo-random values like block hashes or numbers
- There is no uncertainty in Ethereum since execution is deterministic
- External sources of randomness (like oracles)
- An address is cast to a Solidity contract type, but the address might not contain the intended functions implemented correctly
- Malicious contracts could implement the same interface and inject themselves through the `constructor`
- Initializing contracts with `new` at deployment time
- Hardcoding contract addresses
- Third party apps do not validate input to the contracts
- EVM pads short addresses with zeros, which can multiply the trailing value, e.g.:
|<-- address -->|<-- value -->|
|<-- addr -->|<-- value -->000| // zeros filled in, resulting in a much larger value
- Validating input parameters
- Developer does not revert execution based on a failed `send` or `call`
- `send` returns `false` if it fails and does not revert, unlike `transfer`
- Prefer the `transfer` function where possible
- Use the `withdraw` pattern
- Users must call the `withdraw` function, which handles sending and reverts on failure
- A malicious user duplicates a transaction payload with a higher `gasPrice` to "front-run" existing transactions
- Think exchanges, lottery guessing games, etc
- Be wary of both malicious users and miners
- Add explicit upper bounds to `gasPrice`
- Implement "commit-reveal schemes"
- Send a hidden data value and only reveal once included in a block
- Use "submarine sends" via `CREATE2` opcode
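A commit-reveal scheme can be sketched in a few lines. Assumption: SHA-256 from the standard library stands in for the `keccak256` a contract would use, and the function names are hypothetical.

```python
import hashlib
import secrets


def commit(value, salt):
    """Commitment published first: a hash that hides the value."""
    return hashlib.sha256(salt + value).digest()


def verify(commitment, value, salt):
    """Reveal phase: check the disclosed value and salt against the hash."""
    return secrets.compare_digest(commitment, commit(value, salt))


# The salt is a fixed-length random secret, so low-entropy values
# (such as a lottery guess) cannot be brute forced from the commitment.
salt = secrets.token_bytes(32)
commitment = commit(b"42", salt)
```

An observer who sees only `commitment` in the mempool learns nothing to front-run; the value is revealed only after the commitment is already included in a block.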
- A contract loops through a large mapping or array, or a malicious user intentionally creates a huge array (can sometimes exceed block's `gasLimit`)
- Keys are lost to a privileged account (e.g. EoA of `owner` account is lost)
- Provide "pull" rather than "push" operations to withdraw funds (e.g. `withdraw` vs. `distribute`)
- Add lock times or multisig so funds are not lost forever
- Malicious miners change the timestamp to exploit a contract depending on it
- Use block numbers instead of timestamps for entropy
- Old-style constructors from before Solidity `0.4.22`, which use the contract name instead of the `constructor` keyword, become publicly accessible and invokable functions if the name does not match the contract name
- Using Solidity compiler `0.4.22` or later, which uses the `constructor` keyword
- Solidity defaults local variables to `storage` (as opposed to `memory`). So an uninitialized object references `slot[0]` and can overwrite values
- Solidity lint for pointers
- Explicit `memory` use
