Deep dive into Ethereum logs

Hey kids, today we are going low level to understand how Ethereum events and logs work. Put web3 away for a while: it abstracts these details away, and we’d like to get as close to the raw data as possible.

Smart contracts generate logs by firing events. Here’s a transaction receipt that contains one log entry.

The log entry consists of one topic and a data field. The first topic identifies the specific event, but we don’t know which one yet. To decode the data we need to obtain the ABI of the contract found in the address field.

We are interested in items with the type event. Here’s a sample event.

{  
  "anonymous": false,  
  "inputs": [  
    {  
      "indexed": false,  
      "name": "from",  
      "type": "address"  
    },  
    {  
      "indexed": false,  
      "name": "to",  
      "type": "address"  
    },  
    {  
      "indexed": false,  
      "name": "tokenId",  
      "type": "uint256"  
    }  
  ],  
  "name": "Transfer",  
  "type": "event"  
}
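Picking the event entries out of an ABI is a one-line filter. Here is a minimal sketch, assuming the ABI has already been loaded as a list of dicts. The two-item `abi` below is a made-up, heavily trimmed stand-in, not the real contract's ABI.

```python
import json

# Hypothetical, trimmed-down ABI used only for illustration.
abi = json.loads("""
[
  {"type": "function", "name": "ownerOf",
   "inputs": [{"name": "tokenId", "type": "uint256"}]},
  {"type": "event", "name": "Transfer", "anonymous": false,
   "inputs": [
     {"indexed": false, "name": "from", "type": "address"},
     {"indexed": false, "name": "to", "type": "address"},
     {"indexed": false, "name": "tokenId", "type": "uint256"}
   ]}
]
""")

# Keep only the entries that describe events; functions are skipped.
events = [item for item in abi if item.get("type") == "event"]
event_names = [event["name"] for event in events]
print(event_names)
```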

To find out which event the topic refers to, we need to compute the signature of each event and find the matching one. The signature is the Keccak-256 hash of the event name followed by the comma-separated input argument types in parentheses; argument names are ignored. For event Hello(uint256 worldId) the signature will be keccak('Hello(uint256)').

> keccak('Pregnant(address,uint256,uint256,uint256)')
'0x241ea03ca20251805084d27d4440371c34a0b85ff108f6bb5611248f73818b80'

> keccak('Transfer(address,address,uint256)')
'0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef'

> keccak('Approval(address,address,uint256)')
'0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925'

> keccak('Birth(address,uint256,uint256,uint256,uint256)')
'0x0a5311bd2a6608f08a180df2ee7c5946819a649b204b554bb8e39825b2c50ad5'

> keccak('ContractUpgrade(address)')
'0x450db8da6efbe9c22f2347f7c2021231df1fc58d3ae9a2fa75d39fa446199305'
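These hashes can be reproduced without any Ethereum library. One trap: `hashlib.sha3_256` from the Python standard library is the finalized SHA-3, which pads its input differently from the original Keccak-256 that Ethereum uses, so it gives different digests. In practice you would reach for an existing Keccak implementation (for example `eth_utils.keccak`). The compact pure-Python version below is only a sketch for illustration, and the helper names `keccak256`, `event_signature` and `event_topic` are made up for this example. It also assumes plain argument types and does not expand tuple (struct) inputs.

```python
MASK64 = (1 << 64) - 1
RATE = 136  # Keccak-256 absorbs 1088 bits (136 bytes) per block


def _rol64(value, shift):
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state):
    # Load the 200-byte state as a 5x5 matrix of little-endian 64-bit lanes.
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data):
    state = bytearray(200)
    offset = 0
    # Absorb every full block.
    while len(data) - offset >= RATE:
        for i in range(RATE):
            state[i] ^= data[offset + i]
        state = _keccak_f1600(state)
        offset += RATE
    # Absorb the last partial block with the original Keccak padding (0x01 ... 0x80).
    tail = data[offset:]
    for i, byte in enumerate(tail):
        state[i] ^= byte
    state[len(tail)] ^= 0x01
    state[RATE - 1] ^= 0x80
    state = _keccak_f1600(state)
    return bytes(state[:32])


def event_signature(event_abi):
    # Name plus comma-separated argument types; argument names are ignored.
    types = ",".join(i["type"] for i in event_abi["inputs"])
    return f"{event_abi['name']}({types})"


def event_topic(event_abi):
    return "0x" + keccak256(event_signature(event_abi).encode("ascii")).hex()


transfer = {"name": "Transfer", "type": "event", "inputs": [
    {"name": "from", "type": "address"},
    {"name": "to", "type": "address"},
    {"name": "tokenId", "type": "uint256"},
]}
print(event_signature(transfer))
print(event_topic(transfer))
```

Running `event_topic` over every event in the ABI gives you a small lookup table from topic to event, which is all the matching step needs.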

As we can see, the signature of the Transfer event matches the first topic. Now we can decode the data.

> types = [i['type'] for i in e['inputs']]
['address', 'address', 'uint256']

> names = [i['name'] for i in e['inputs']]
['from', 'to', 'tokenId']

> values = eth_abi.decode_abi(types, log['data'])
('0x0035fc5208ef989c28d47e552e92b0c507d2b318',
 '0x646985c36ad7bf4f3a91283f3ea6eda2af79fac6',
 107696)

> dict(zip(names, values))
{'from': '0x0035fc5208ef989c28d47e552e92b0c507d2b318',
 'to': '0x646985c36ad7bf4f3a91283f3ea6eda2af79fac6',
 'tokenId': 107696}

We can finally read the log. The message says:

“Transfer from 0x0035… to 0x6469… of token 107696”.
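If you are curious what the decoder does under the hood: for static types like these, the data field is just a sequence of 32-byte words, one per non-indexed argument. Here is a rough sketch that handles only `address`, `uintN` and `bool`. The helper names are made up, and the sample `data` is rebuilt from the decoded values shown above rather than copied from the receipt. Dynamic types such as `string` or `bytes` use offsets and are deliberately not covered.

```python
def decode_word(abi_type, word):
    """Decode a single 32-byte ABI word for a few static types."""
    if abi_type == "address":
        return "0x" + word[-20:].hex()      # an address is the low 20 bytes
    if abi_type.startswith("uint"):
        return int.from_bytes(word, "big")  # unsigned big-endian integer
    if abi_type == "bool":
        return int.from_bytes(word, "big") != 0
    raise NotImplementedError(f"type not covered by this sketch: {abi_type}")


def decode_data(types, data_hex):
    """Split the data field into 32-byte words and decode them in order."""
    raw = bytes.fromhex(data_hex[2:] if data_hex.startswith("0x") else data_hex)
    if len(raw) != 32 * len(types):
        raise ValueError("unexpected data length for static types")
    return tuple(decode_word(t, raw[32 * i:32 * (i + 1)])
                 for i, t in enumerate(types))


def pad_word(hex_digits):
    """Left-pad hex digits with zeros to one 32-byte word (64 hex characters)."""
    return hex_digits.rjust(64, "0")


# Sample data field, reconstructed from the values decoded above.
data = "0x" + "".join([
    pad_word("0035fc5208ef989c28d47e552e92b0c507d2b318"),
    pad_word("646985c36ad7bf4f3a91283f3ea6eda2af79fac6"),
    pad_word(format(107696, "x")),
])

values = decode_data(["address", "address", "uint256"], data)
print(values)
```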

Indexed fields

Each indexed field generates a new topic which is excluded from the data. This allows for efficient search but makes parsing a bit more complicated.

Let’s look at another receipt:

This one has three topic fields and a small data field. We can already tell that the first topic is Transfer(address,address,uint256) despite the different argument names in this contract.

{
  "anonymous": false,
  "inputs": [
    {
      "indexed": true,
      "name": "from",
      "type": "address"
    },
    {
      "indexed": true,
      "name": "to",
      "type": "address"
    },
    {
      "indexed": false,
      "name": "value",
      "type": "uint256"
    }
  ],
  "name": "Transfer",
  "type": "event"
}

The remaining two topics are simply indexed inputs. The only value in data is the remaining uint256.

> types = [i['type'] for i in e['inputs'] if not i['indexed']]
['uint256']

> names = [i['name'] for i in e['inputs'] if not i['indexed']]
['value']

> values = eth_abi.decode_abi(types, log['data'])
(5000000000,)

> indexed_types = [i['type'] for i in e['inputs'] if i['indexed']]
['address', 'address']

> indexed_names = [i['name'] for i in e['inputs'] if i['indexed']]
['from', 'to']

> indexed_values = [eth_abi.decode_single(t, v) for t, v in zip(indexed_types, log['topics'][1:])]
['0x00b46c2526e227482e2ebb8f4c69e4674d262e75',
 '0x54a2d42a40f51259dedd1978f6c118a0f0eff078']

> from itertools import chain

> dict(chain(zip(names, values), zip(indexed_names, indexed_values)))
{'from': '0x00b46c2526e227482e2ebb8f4c69e4674d262e75',
 'to': '0x54a2d42a40f51259dedd1978f6c118a0f0eff078',
 'value': 5000000000}
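The two passes above can also be folded into a single loop over the inputs, taking each value from the topics or from the data depending on its `indexed` flag. This is a sketch under a few assumptions: the event is not anonymous (so topics[0] is the signature), all types are static, and the helper names are invented for this example. For indexed dynamic types such as `string`, the topic holds a hash of the value rather than the value itself, so they cannot be decoded this way. The sample `log` is rebuilt from the decoded values shown above.

```python
def to_bytes(hex_string):
    return bytes.fromhex(hex_string[2:] if hex_string.startswith("0x") else hex_string)


def decode_word(abi_type, word):
    """Decode a single 32-byte word for a few static types."""
    if abi_type == "address":
        return "0x" + word[-20:].hex()
    if abi_type.startswith("uint"):
        return int.from_bytes(word, "big")
    raise NotImplementedError(f"type not covered by this sketch: {abi_type}")


def decode_log(event_abi, log):
    topics = iter(log["topics"][1:])  # skip topics[0], the event signature
    raw = to_bytes(log["data"])
    offset = 0
    decoded = {}
    for item in event_abi["inputs"]:
        if item["indexed"]:
            word = to_bytes(next(topics))    # indexed values live in the topics
        else:
            word = raw[offset:offset + 32]   # the rest are packed into data
            offset += 32
        decoded[item["name"]] = decode_word(item["type"], word)
    return decoded


event_abi = {"name": "Transfer", "type": "event", "anonymous": False, "inputs": [
    {"indexed": True, "name": "from", "type": "address"},
    {"indexed": True, "name": "to", "type": "address"},
    {"indexed": False, "name": "value", "type": "uint256"},
]}

# Sample log, reconstructed from the values decoded above.
log = {
    "topics": [
        "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
        "0x" + "00b46c2526e227482e2ebb8f4c69e4674d262e75".rjust(64, "0"),
        "0x" + "54a2d42a40f51259dedd1978f6c118a0f0eff078".rjust(64, "0"),
    ],
    "data": "0x" + format(5000000000, "x").rjust(64, "0"),
}

decoded = decode_log(event_abi, log)
print(decoded)
```

A side effect of looping over the inputs is that the result keeps the argument order from the ABI.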

Note that Ethereum always uses integers to represent numeric values, so we got the value in the token's smallest denomination. To get the human-readable amount, move the decimal point left by the number of decimals defined in the contract, 3 in this case. The message reads:

“Transfer from 0x00b4… to 0x54a2… of 5,000,000 tokens”.
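The decimal shift is best done with exact arithmetic, since floats can lose precision on large token amounts. Here is a small sketch using the standard `decimal` module. The function name `to_token_units` is made up, and the `decimals` value is assumed to have been read from the contract beforehand.

```python
from decimal import Decimal


def to_token_units(raw_value, decimals):
    """Shift the decimal point left by `decimals` places without rounding."""
    sign, digits, exponent = Decimal(raw_value).as_tuple()
    # Rebuilding from the tuple is exact and ignores the context precision.
    return Decimal((sign, digits, exponent - decimals))


amount = to_token_units(5000000000, 3)
print(f"{amount:,}")
```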

Querying

Now you are ready to do some searching. You can query the blockchain through the JSON-RPC API, which is provided by full nodes like geth or Parity or by a service like Infura. Bloom filters allow you to scan the entire blockchain in seconds and find logs matching a specific topic.

You can also specify the search range with fromBlock and toBlock, limit the search to a specific contract address, and replace any of the topics with null, which works like a wildcard. The full specification can be found here.
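To make this concrete, here is a sketch of an `eth_getLogs` request that looks for Transfer events sent to one address, with the sender left as a wildcard. The block range is arbitrary, the helper names are invented, and `post_json` expects the URL of whatever node or provider you use. Note that an address used as a topic filter has to be left-padded to 32 bytes, just as it appears in the log.

```python
import json
import urllib.request

TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def address_topic(address):
    """Left-pad a 20-byte address to the 32-byte form used in topics."""
    return "0x" + address[2:].lower().rjust(64, "0")


def build_get_logs_request(from_block, to_block, topics, address=None, request_id=1):
    """Build the JSON-RPC payload; block numbers are sent as hex strings."""
    log_filter = {
        "fromBlock": hex(from_block),
        "toBlock": hex(to_block),
        "topics": topics,  # None is serialized as null, i.e. a wildcard
    }
    if address is not None:
        log_filter["address"] = address  # restrict to a single contract
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "eth_getLogs", "params": [log_filter]}


def post_json(url, payload):
    """Send the payload to a node; `url` is whatever endpoint you have access to."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_get_logs_request(
    from_block=4000000,  # arbitrary example range
    to_block=4000100,
    topics=[
        TRANSFER_TOPIC,  # topics[0]: the event signature
        None,            # topics[1]: any sender
        address_topic("0x54a2d42a40f51259dedd1978f6c118a0f0eff078"),  # recipient
    ],
)
print(json.dumps(payload, indent=2))
```

The response carries a list of log entries under `result`, each with the same `topics` and `data` fields we decoded earlier.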

Of course, everything described here is already implemented in web3, which lets you conveniently query events by name and decodes log data automatically. But if you care about performance like I do, this approach could save you a lot of time.

If you learned something new, clap and subscribe to crypto eli5.