Minimal Proxy Compendium

The history of minimal proxies and how to scan the blockchain on your laptop in seconds

ethereum
data
Author

banteg

Published

September 12, 2023

PART I: Proxy Types

While working on adding support for more proxy types to Ape, I got sucked into some of the most incredible onchain archeology. Follow along and you will be able to repeat this journey even on your laptop.

What is a Proxy

A proxy is a tiny contract, often in hand-written assembly, that makes it cheap to deploy instances of a logic contract while making it slightly more expensive to call them. Proxies are good citizens: they don’t clutter the blockchain state, and they make creating a pool as affordable as making a trade.

They work by using a special type of call named delegatecall that keeps the execution context inside the proxy while executing the code of the implementation contract. This allows the code from the logic contract to be used as a library and interact with the storage of the proxy contract.
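To make the distinction concrete, here is a toy Python model of the two call types. It is not EVM code, and the `Contract` class with its `call` and `delegatecall` methods is invented purely for illustration: the only point is which storage the logic code ends up touching.

```python
class Contract:
    """A toy account: some code (a Python callable) plus its own storage."""

    def __init__(self, code=None):
        self.code = code
        self.storage = {}

    def call(self, target, *args):
        # Regular call: the target's code runs against the target's storage.
        return target.code(target.storage, *args)

    def delegatecall(self, target, *args):
        # Delegatecall: the target's code runs against the caller's storage.
        return target.code(self.storage, *args)


def set_owner(storage, owner):
    """Hypothetical logic contract code that writes a single storage slot."""
    storage["owner"] = owner


logic = Contract(code=set_owner)
proxy = Contract()

proxy.delegatecall(logic, "alice")
print(proxy.storage)  # {'owner': 'alice'}
print(logic.storage)  # {}
```

The logic contract's own storage stays empty, which is why a single deployment of it can back any number of proxies.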

Let’s look at different proxies as they were invented.

Vyper Beta

Unlike Solidity, which still lacks this feature, Vyper has offered a safe way of creating minimal proxies since the very first beta. The most widely used deployment of these pre-standard proxies can be found in the Uniswap v1 factory contract, which is responsible for the majority of the 4270 instances that follow this pattern.

If you have ever worked with Uniswap v1, you might know the quirk of these proxies: they always return 4096 bytes (0x1000 in hex), regardless of the actual response length.

    CALLDATASIZE
    PUSH1 0x0
    PUSH1 0x0
    CALLDATACOPY
    PUSH2 0x1000
    PUSH1 0x0
    CALLDATASIZE
    PUSH1 0x0
    PUSH20 0x2157a7894439191e520825fe9399ab8655e0f708
    GAS
    DELEGATECALL
    ISZERO
    PC
    JUMPI
    PUSH2 0x1000
    PUSH1 0x0
    RETURN
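The fixed-size return can be modelled in a few lines of Python. This is a sketch based on my reading of the opcode listing above, not a real EVM: the function name `vyper_beta_return` is made up, and it only covers the successful path.

```python
RETURN_SIZE = 0x1000  # 4096, the size hardcoded in the proxy


def vyper_beta_return(calldata: bytes, returndata: bytes) -> bytes:
    """Model what the proxy hands back to its caller after a successful call."""
    # Memory grows to at least 0x1000 bytes because of the return buffer.
    memory = bytearray(max(RETURN_SIZE, len(calldata)))
    # CALLDATACOPY writes the calldata to memory starting at offset 0.
    memory[: len(calldata)] = calldata
    # DELEGATECALL writes at most RETURN_SIZE bytes of output back at offset 0.
    copied = returndata[:RETURN_SIZE]
    memory[: len(copied)] = copied
    # RETURN always hands back the first 0x1000 bytes of memory.
    return bytes(memory[:RETURN_SIZE])


# a 4-byte selector as calldata, a single 32-byte word as the real response
response = vyper_beta_return(bytes.fromhex("12345678"), (1).to_bytes(32, "big"))
print(len(response))  # 4096
```

Under this model the caller gets the real response followed by padding, so decoding the leading words still works even though the length is always the same.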

    EIP-1167 Minimal Proxy Contract

    The most popular immutable proxy was invented in 2018 and finalized the next year. It is so abundant that it accounts for 21.6% of all Ethereum contracts.

    The standard also suggests an optimization where you are supposed to mine a shorter logic contract address and replace PUSH20 with PUSH16, saving 4 bytes on every deployment but winning nothing in terms of runtime costs.

    Five years later, out of 13,504,799 instances across 31,618 implementations, only 2 have followed the advice, and only 1 to the letter.
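As a rough illustration of what the optimization buys, the sketch below assembles both runtime variants from the byte templates that appear in the regex patterns later in this article. The helper names are hypothetical, and this only builds the runtime code, not the init code a factory would deploy.

```python
EIP1167_PREFIX = "363d3d373d3d3d363d73"  # ends with PUSH20 (0x73)
EIP1167_SUFFIX = "5af43d82803e903d91602b57fd5bf3"
OPTIMIZED_PREFIX = "363d3d373d3d3d363d6f"  # ends with PUSH16 (0x6f)
OPTIMIZED_SUFFIX = "5af43d82803e903d91602757fd5bf3"  # jump target is 4 bytes earlier


def _address_bytes(implementation: str) -> bytes:
    raw = bytes.fromhex(implementation.removeprefix("0x"))
    if len(raw) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    return raw


def eip1167_runtime(implementation: str) -> str:
    """Standard variant: the full address follows PUSH20."""
    return EIP1167_PREFIX + _address_bytes(implementation).hex() + EIP1167_SUFFIX


def eip1167_optimized_runtime(implementation: str) -> str:
    """PUSH16 variant: only valid if the address starts with 4 zero bytes."""
    raw = _address_bytes(implementation)
    if raw[:4] != bytes(4):
        raise ValueError("implementation address needs 4 leading zero bytes")
    return OPTIMIZED_PREFIX + raw[4:].hex() + OPTIMIZED_SUFFIX


standard = eip1167_runtime("0x" + "be" * 20)
optimized = eip1167_optimized_runtime("00000000" + "be" * 16)
print(len(standard) // 2, len(optimized) // 2)  # 45 41
```

The explicit check on the leading bytes is the guard that the deployments discussed in this section would have needed.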

    Optimized EIP-1167

    The first successful attempt is an ERC20 factory. The optimization seems misplaced here, given the contract allows the user to specify the library address. The user must be aware she can only use a library with an address that starts with at least 4 zero bytes, making the whole concept very error-prone.

    This is the exact fate of our second optimizer, who deployed 125 proxies only to discover that none of them work. Instead of mining a 4-byte zero prefix for the library, he has mined it for the factory, which was a fatal mistake. The full library address, which you can still recover from the factory storage, got cut off by PUSH16, rendering all proxies irredeemably broken.

    88076ef09c9d720cee4c096765a0581068d97aa9 library address
    000000009c9d720cee4c096765a0581068d97aa9 push16-chewed address
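The truncation is easy to reproduce. The small sketch below (the function name is made up) keeps the low 16 bytes, as PUSH16 would, and left-pads the result back to the width of an address:

```python
def push16_chewed(address: str) -> str:
    """Address a proxy ends up calling if only the low 16 bytes are pushed."""
    raw = bytes.fromhex(address)
    if len(raw) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    # PUSH16 carries the last 16 bytes; the value is zero-extended on the left
    return raw[-16:].rjust(20, b"\x00").hex()


print(push16_chewed("88076ef09c9d720cee4c096765a0581068d97aa9"))
# 000000009c9d720cee4c096765a0581068d97aa9
```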

    EIP-1167 in Vyper

    Vyper allows you to create EIP‑1167 proxies natively using create_minimal_proxy_to(address)5.

  • 5 available since Vyper 0.3.4

  • You can deploy a proxy to a deterministic address using CREATE2. The only thing you need to change is to supply a salt6 parameter.

  • 6 available since Vyper 0.3.0

  • In older contracts you might encounter create_forwarder_to(address)7, which does the same and was the first time EIP-1167 appeared in Vyper.

  • 7 available since Vyper 0.2.9

    0age shaves one byte off

    Feeling personally challenged by the word “minimal”, 0age came up with a marginal improvement to the contract8, cutting the deployment cost by 200 gas and making each call cheaper by 4 gas.

    It is the second most popular minimal proxy, with 9,928 instances (0.016% of all contracts) across 17 unique logic contracts.
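The 200 gas figure matches the code deposit charge of 200 gas per byte of deployed runtime code. A quick check, using the two byte templates from the regex patterns later in the article with a placeholder implementation address:

```python
CODE_DEPOSIT_GAS_PER_BYTE = 200

PLACEHOLDER = "be" * 20  # stand-in for the implementation address

EIP1167 = "363d3d373d3d3d363d73" + PLACEHOLDER + "5af43d82803e903d91602b57fd5bf3"
ZERO_AGE = "3d3d3d3d363d3d37363d73" + PLACEHOLDER + "5af43d3d93803e602a57fd5bf3"


def deposit_cost(runtime_hex: str) -> int:
    """Gas charged for storing the runtime code, ignoring every other cost."""
    return len(bytes.fromhex(runtime_hex)) * CODE_DEPOSIT_GAS_PER_BYTE


print(len(EIP1167) // 2, len(ZERO_AGE) // 2)  # 45 44
print(deposit_cost(EIP1167) - deposit_cost(ZERO_AGE))  # 200
```

This only accounts for the deposit charge; the per-call saving comes from the cheaper opcode sequence and is not modelled here.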

    Diff EIP-1167 and The More-Minimal Proxy
    -CALLDATASIZE
     RETURNDATASIZE
     RETURNDATASIZE
    -CALLDATACOPY
     RETURNDATASIZE
     RETURNDATASIZE
    +CALLDATASIZE
     RETURNDATASIZE
    +RETURNDATASIZE
    +CALLDATACOPY
     CALLDATASIZE
     RETURNDATASIZE
     PUSH20 0xbebebebebebebebebebebebebebebebebebebebe
     GAS
     DELEGATECALL
     RETURNDATASIZE
    -DUP3
    +RETURNDATASIZE
    +SWAP4
     DUP1
     RETURNDATACOPY
    -SWAP1
    -RETURNDATASIZE
    -SWAP2
    -PUSH1 0x2b
    +PUSH1 0x2a
     JUMPI
     REVERT
     JUMPDEST

    Vectorized pushes to zero

    Another optimizoor has made a version that utilizes a new PUSH0 opcode. It increases the size back to 45 bytes, but calls become 8 gas cheaper compared to EIP‑1167. You can read the annotated bytecode in the Solady repo9.

  • 9 LibClone.sol in Vectorized/solady

  • Diff EIP-1167 and Solady PUSH0
    +PUSH0
    +PUSH0
     CALLDATASIZE
    -RETURNDATASIZE
    -RETURNDATASIZE
    +PUSH0
    +PUSH0
     CALLDATACOPY
    -RETURNDATASIZE
    -RETURNDATASIZE
    -RETURNDATASIZE
     CALLDATASIZE
    -RETURNDATASIZE
    +PUSH0
     PUSH20 0xbebebebebebebebebebebebebebebebebebebebe
     GAS
     DELEGATECALL
     RETURNDATASIZE
    -DUP3
    -DUP1
    +PUSH0
    +PUSH0
     RETURNDATACOPY
    -SWAP1
    -RETURNDATASIZE
    -SWAP2
    -PUSH1 0x2b
    +PUSH1 0x29
     JUMPI
    +RETURNDATASIZE
    +PUSH0
     REVERT
     JUMPDEST
    +RETURNDATASIZE
    +PUSH0
     RETURN
    PUSH0 requires care

    PUSH0 was added in the Shanghai upgrade, which saw a slow rollout outside Ethereum mainnet. Even if you manage to deploy such a contract on a network that has not upgraded yet, it won’t work until the network does.
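If you want to check a bytecode for PUSH0 before deploying it to another network, a minimal sketch could look like the one below. The function name is hypothetical. It walks the code opcode by opcode so that a 0x5f byte inside a PUSH argument (for example, inside the implementation address) is not mistaken for the instruction. It does not know where a trailing data section begins, so appended data can still cause a false positive.

```python
PUSH0, PUSH1, PUSH32 = 0x5F, 0x60, 0x7F


def uses_push0(code_hex: str) -> bool:
    """Return True if a PUSH0 opcode appears outside of PUSH arguments."""
    code = bytes.fromhex(code_hex)
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH0:
            return True
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH0  # skip the argument of PUSH1..PUSH32
        pc += 1
    return False


solady = "5f5f365f5f37365f73" + "be" * 20 + "5af43d5f5f3e6029573d5ffd5b3d5ff3"
# an EIP-1167 proxy whose implementation address happens to be all 0x5f bytes
eip1167 = "363d3d373d3d3d363d73" + "5f" * 20 + "5af43d82803e903d91602b57fd5bf3"

print(uses_push0(solady))   # True
print(uses_push0(eip1167))  # False
```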

    Is it worth it?

    Optimizing such a tiny contract undoubtedly feels validating. But consider this: every byte or unit of gas squeezed out of it sets you back to zero tool support. If you consider using these variants, be prepared to make pull requests and even bribe Etherscan so they add support.

    Huff developers, who think entirely in bytecode, have chosen to support the standard EIP-1167 proxy in their library10.

  • 10 Clones.huff in huff-language/huffmate

    Clones With Immutable Arguments

    This family of proxies abuses the fact that you can use a contract’s own code as cheap write-once storage, reading it with CODECOPY. Immutable variables work in a similar fashion: they must be set in the constructor, which then appends them to the returned runtime bytecode.

    But how do we get the immutables through to the logic contract? Enter wighawag who has invented ClonesWithCallData11. Before forwarding a call, this proxy type reads the extra variables from the code and appends them to the end of calldata, along with a 2 byte extra length. The length is an important security measure that helps delineate where the user-provided calldata ends and immutable variables begin. A function it forwards to will simply ignore everything that comes after its arguments, but the contract itself would be able to read them from msg.data using a tiny helper function that calculates the offset. It’s a win-win situation!
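The calldata layout can be sketched in Python. The two function names are invented, and the sketch assumes that the 2-byte length counts only the immutable bytes. Deployed variants may define the length differently (for example, including the length field itself), so treat this as an illustration of the idea rather than the exact encoding.

```python
def forward_calldata(user_calldata: bytes, immutable_args: bytes) -> bytes:
    """What the proxy forwards: user calldata, then immutables, then a 2-byte length."""
    length = len(immutable_args).to_bytes(2, "big")
    return user_calldata + immutable_args + length


def read_immutable_args(msg_data: bytes) -> bytes:
    """What the logic contract's helper does: find the immutables from the tail."""
    length = int.from_bytes(msg_data[-2:], "big")
    offset = len(msg_data) - 2 - length
    return msg_data[offset:-2]


selector_and_args = bytes.fromhex("a9059cbb") + bytes(64)  # an ordinary call
immutables = bytes.fromhex("be" * 20) + (18).to_bytes(1, "big")

msg_data = forward_calldata(selector_and_args, immutables)
print(read_immutable_args(msg_data) == immutables)  # True
```

Because the offset is computed from the end of `msg.data`, the user cannot shift where the immutables are read from by sending longer or shorter arguments.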

  • 11 ClonesWithCallData.sol in wighawag/clones-with-immutable-args

  • The idea was further iterated upon by boredGenius, who optimized the bytecode and named it ClonesWithImmutableArgs12. This version has made it into production with 264 instances across 36 logic contracts, while another optimization suggested by Saw-mon & Natalie was ghosted as a forever open pull request13, with only 2 (presumably test) contracts on mainnet.

  • 12 ClonesWithImmutableArgs.sol in wighawag/clones-with-immutable-args

  • 13 Gas Optimizations

  • 0xSplits uses another older version of these proxies for their VestingModule. 1 of 3 such proxies is used to fund Ethereum core devs. Interestingly, Saw-mon-and-Natalie updates were ported to 0xSplits fork later on, but this version seems to have never made it to prod.

    There exists a Huff rewrite14 of CWIA proxies in huffmate under the name of huff-clones.

  • 14 huff-clones in huff-language/huffmate

  • A compatible Vyper version15 was made by banteg, but creating a Vyper factory would require a new language feature16.

    CWIA proxies are used in production by Ajna, Astaria, Buttonwood, 0xSplits, Sudoswap.

    The difference between CWIA and Sudoswap proxies highlighted with a regex

    Sequence Wallet

    Despite being upgradeable, the Sequence proxy is worth mentioning because it achieves immutability for both init and runtime code. It gets there by using the address of the proxy as a storage key for the implementation address. Reading from storage makes it the most expensive of all proxies we have looked at today. This pattern is used in their modular and extensible smart wallet17. There are 1,888 Sequence proxies deployed at the time of writing.

  • 17 Wallet.sol in 0xsequence/wallet-contracts

    PART II: The Data

    In this part you’ll be converted to a parquet maxi capable of regexing through an entire blockchain in seconds. We will look at the most popular patterns and rank logic contracts by number of proxies pointing to them. I’ll take you through preparing and transforming the data needed for a deeper look at the State of the Proxy.

    Choosing the right tool

    I learned of parquet format as recently as this PSA, and it’s as good as it says.

    When I started working on this article I used pdp. You’ll be using cryo, a Rust rewrite of pdp that has recently added support for extracting the same contracts dataset we are interested in.

    Tracing the entire blockchain took 9 hours for me with Erigon. I recommend running rpcdaemon as a separate process as it greatly improves stability.

    Run Erigon with a --metrics option, collect with Prometheus and display in Grafana

    After you tame cryo so it doesn’t overload your Ethereum node when it hammers it with thousands of trace_block requests, you’ll end up with a folder of parquets. You can load them with pandas, but we’ll be using polars.

    It is similar to pandas in goals, but has a different and, frankly, better designed api with no index columns. Polars also allows for lazy processing (achievable with dask in pandas world) and stacking insanely large queries, optimized by its query planner.

    All that will come in handy since our dataset is gonna be massive. A similar but less complete dataset in BigQuery takes up 58.62 GB18. To your surprise, the folder of parquets is significantly smaller, just 15.68 GB. And you have more data than BigQuery already, with additional columns for deployer, factory, init code, and init/runtime code hashes.

  • 18 All sizes in the article are base-10, so 1 GB = 1 billion bytes

  • Parquet dataset size
    from collections import defaultdict

    import polars as pl
    from pyarrow.parquet import ParquetFile, ParquetDataset
    
    def parquet_size(path):
        dataset = ParquetDataset(path)
        data = defaultdict(int)
        for file in dataset.files:
            meta = ParquetFile(file).metadata
            data["rows"] += meta.num_rows
            for row_group in range(meta.num_row_groups):
                for col in range(meta.num_columns):
                    item = meta.row_group(row_group).column(col)
                    data["compressed"] += item.total_compressed_size
                    data["uncompressed"] += item.total_uncompressed_size
    
        return pl.DataFrame(data).with_columns(
            (pl.col("uncompressed", "compressed") / 1e9),
            ratio=pl.col.uncompressed / pl.col.compressed,
        )
    parquet_size("~/data/contracts")
    rows compressed uncompressed ratio
    62498184 15.680626 73.873403 4.711126

    This 62.5 million row dataset would’ve been 73.87 GB uncompressed. Parquet offers decent compression out of the box, but we can get better size (2x) and performance (3x) if we consolidate our 18,000 chunks of 1000 blocks per file into 18 chunks of 1 million blocks per file.

    After repartitioning with pyarrow19 (around 4 minutes) and a sprinkle of zstd compression (parquet even supports a different compression strategy per column) we end up with a 7.79 GB dataset. All Ethereum contracts are at your fingertips from this moment on.

  • 19 See also cryogen, if you are interested in keeping your dataset fresh and optimized

    parquet_size('~/data/contracts_out')
    rows compressed uncompressed ratio
    62498184 7.79446 50.26806 6.449203

    Unique bytecodes

    Out of 62.5 million contracts there are only 1.06 million unique bytecodes. Since we are going to search through them a lot, it makes sense to prepare another parquet that only contains unique codes. We could also add code hashes to use as a join key should we require more info from the main dataset. The number of contracts matching each unique code hash would also come in handy for aggregate statistics.

    Doing this would save us a few seconds on each query. You can additionally pre-encode codes as hex strings if you are mostly going to regex through them. The overhead is manageable thanks to compression, and it’s not like you have much of a choice: binary column support in polars is very limited at the time of writing.

    (
        pl.scan_parquet(contracts_path)
        .select("code", "code_hash", "contract_address")
        .group_by("code_hash")
        .agg(
            pl.col.code.first().bin.encode("hex"),
            num_contracts=pl.count(),
            first_contract=pl.col.contract_address.first().bin.encode("hex"),
        )
        .collect()
        .write_parquet(codes_path)
    )
    parquet_size(codes_path)
    rows compressed uncompressed ratio
    1058734 1.117846 14.339663 12.827943

    Regex the contracts

    With the data prepared, we can search through the bytecodes with a regex and even extract the implementation address from a matching group. But where do we obtain said patterns?

    Luckily, adding support for different proxies has been a long obsession of mine as an Ape contributor20, so I’ve amassed quite a trove of these patterns.

  • 20 See this file and find all the references in this file

    PROXY_PATTERNS = {
        "EIP-6551": r"^363d3d373d3d3d363d73(.{40})5af43d82803e903d91602b57fd5bf3.+",
        "EIP-1167": r"^363d3d373d3d3d363d73(.{40})5af43d82803e903d91602b57fd5bf3",
        "EIP-1167 Optimized": r"^363d3d373d3d3d363d6f(.{32})5af43d82803e903d91602757fd5bf3",
        "0age": r"^3d3d3d3d363d3d37363d73(.{40})5af43d3d93803e602a57fd5bf3",
        "0xSplits Clones": r"^36603057343d52307f830d2d700a97af574b186c80d40429385d24241565b08a7c559ba283a964d9b160203da23d3df35b3d3d3d3d363d3d37363d73(.{40})5af43d3d93803e605b57fd5bf3",
        "Vyper": r"^366000600037611000600036600073(.{40})5af4602c57600080fd5b6110006000f3",
        "Vyper Beta": r"^366000600037611000600036600073(.{40})5af41558576110006000f3",
        "ClonesWithImmutableArgs": r"^3d3d3d3d363d3d3761.{4}603736393661.{4}013d73(.{40})5af43d3d93803e603557fd5bf3",
        "ClonesWithCallData": r"^363d3d3761.{4}603836393d3d3d3661.{4}013d73(.{40})5af43d82803e903d91603657fd5bf3",
        "sudoswap ClonesWithImmutableArgs": r"^3d3d3d3d363d3d37605160353639366051013d73(.{40})5af43d3d93803e603357fd5bf3",
        "solady ClonesWithImmutableArgs": r"^36602c57343d527f9e4ac34f21c619cefc926c8bd93b54bf5a39c7ab2127a895af1cc0691d7e3dff593da1005b363d3d373d3d3d3d61.{4}806062363936013d73(.{40})5af43d3d93803e606057fd5bf3",
        "sw0nt ClonesWithImmutableArgs": r"^363d3d373d3d3d3d61.{4}806035363936013d73(.{40})5af43d3d93803e603357fd5bf3",
        "0xSplits ClonesWithImmutableArgs": r"^36602f57343d527f9e4ac34f21c619cefc926c8bd93b54bf5a39c7ab2127a895af1cc0691d7e3dff60203da13d3df35b3d3d3d3d363d3d3761.{4}606736393661.{4}013d73(.{40})5af43d3d93803e606557fd5bf3",
        "Solady PUSH0": r"^5f5f365f5f37365f73(.{40})5af43d5f5f3e6029573d5ffd5b3d5ff3",
        "Sequence": r"^363d3d373d3d3d363d30545af43d82803e903d91601857fd5bf3",
    }

    I’ve put matching groups around where the implementation addresses are present in bytecodes. They mostly come after PUSH20, with the exceptions of PUSH16 in optimized EIP-1167 and no address in Sequence proxy, where you would need to look for it in the contract storage.
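Before bringing polars into the picture, here is how a few of these patterns behave with the standard `re` module on a single bytecode. The `classify` helper and the trimmed-down `PATTERNS` dict are illustrative only. Order matters in this sketch, because the EIP-1167 pattern is not anchored at the end and therefore also matches bytecodes that carry trailing data.

```python
import re

# a subset of the patterns above; dicts preserve insertion order
PATTERNS = {
    "EIP-6551": r"^363d3d373d3d3d363d73(.{40})5af43d82803e903d91602b57fd5bf3.+",
    "EIP-1167": r"^363d3d373d3d3d363d73(.{40})5af43d82803e903d91602b57fd5bf3",
    "EIP-1167 Optimized": r"^363d3d373d3d3d363d6f(.{32})5af43d82803e903d91602757fd5bf3",
    "Sequence": r"^363d3d373d3d3d363d30545af43d82803e903d91601857fd5bf3",
}


def classify(code_hex: str):
    """Return (proxy type, captured implementation or None) for the first match."""
    for name, pattern in PATTERNS.items():
        match = re.match(pattern, code_hex)
        if match:
            # the Sequence pattern has no capture group, the address is in storage
            impl = match.group(1) if match.groups() else None
            return name, impl
    return None, None


plain = "363d3d373d3d3d363d73" + "be" * 20 + "5af43d82803e903d91602b57fd5bf3"
print(classify(plain))              # ('EIP-1167', 'bebe...be')
print(classify(plain + "00" * 128)[0])  # EIP-6551
```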

    This allows us to extract hardcoded implementation addresses for most types of proxies.

    def search_codes(pattern):
        return (
            pl.scan_parquet(codes_path)
            .filter(pl.col.code.str.contains(pattern))
            .with_columns(extracted=pl.col.code.str.extract(pattern, 1))
            .collect()
        )

    To make queries faster, we can load a partial dataset into memory. Let’s assume that a minimal proxy is no larger than 512 bytes, including the data section.

    small_codes = (
        pl.scan_parquet(codes_path)
        .filter(pl.col.code.str.lengths() // 2 <= 512)
        .collect()
    )
    
    def search_codes(pattern):
        return (
            small_codes
            .filter(pl.col.code.str.contains(pattern))
            .with_columns(extracted=pl.col.code.str.extract(pattern, 1))
        )

    EIP-1167 plus?

    Let’s put it to the test and collect statistics about the most common minimal proxy.

    minimal_proxies = search_codes(PROXY_PATTERNS['EIP-1167'])
    minimal_proxies.select(
        num_contracts=pl.col.num_contracts.sum(),
        num_codes=pl.count(),
        num_impls=pl.col.extracted.unique().count(),
    )
    num_contracts num_codes num_impls
    13504799 37349 31618

    Observe the discrepancy between the number of unique codes and the number of implementations. Where does it come from? Could it be some extra data appended at the end? Let’s group them by code size.

    (
        minimal_proxies.group_by((pl.col.code.str.lengths() // 2).alias("code_size"))
        .agg(
            pl.col.num_contracts.sum(),
            num_codes=pl.count(),
            num_impls=pl.col.extracted.unique().count(),
        )
        .sort("num_contracts", descending=True)
    )
    code_size num_contracts num_codes num_impls
    45 13499061 31611 31611
    173 5738 5738 7

    There is exactly one alternative flavor of these proxies with data at the end. After looking at sample contracts, they appear to be related to EIP-6551.

    Since there are just 7 implementations, let’s look at the number of proxies pointing to each of them.

    eip_6551 = minimal_proxies.filter(pl.col.code.str.lengths() // 2 > 45)
    (
        eip_6551
        .select(pl.col.extracted.value_counts(sort=True))
        .unnest('extracted')
    )
    extracted counts
    "2d25602551487c3f3354dd80d76d54383a243358" 3535
    "5ae5f4d4982ede2c820d2a9827ccb97fed6cef71" 2174
    "d00431d1bfe2f0f85396e685d890e18f8dc411aa" 24
    "ed4e5b7338e8b4b5299c69b5a7c0f82c4426439e" 2
    "811fa807cf230e32211ed168920fe8b27d01cb1c" 1
    "8ee9a60cb5c0e7db414031856cb9e0f1f05988d1" 1
    "dc4f50191a6f250741d9aa733dd4299499285314" 1

    After reading the EIP, we can decode the data section and look at the most popular NFTs that offer token bound accounts.

    import eth_abi
    
    def eip6551_token(code):
        types = ["uint256", "uint256", "address", "uint256"]
        names = ["salt", "chain_id", "token_contract", "token_id"]
        result = dict(zip(names, eth_abi.decode(types, bytes.fromhex(code)[45:])))
        return result["token_contract"]
    
    
    eip_6551_tokens = (
        eip_6551.with_columns(token_contract=pl.col.code.map_elements(eip6551_token))
        .select(pl.col.token_contract.value_counts(sort=True))
        .unnest("token_contract")
    )
    print(len(eip_6551_tokens), "tokens")
    eip_6551_tokens.head(10)
    825 tokens
    token_contract counts
    "0xa87ea7c8745980490bcdcff97fe7328535098cd1" 2174
    "0x8ddef0396d4b61fcbb0e4a821dfac52c011f79da" 835
    "0x26727ed4f5ba61d3772d1575bca011ae3aef5d36" 487
    "0x8c34e6e60731d1ff7e26c712ea1f798f90f29ec6" 433
    "0x235c939ae3859f8041cee3ccb2c58a7983bf02b1" 105
    "0xc1341a63a4dd85443b0413bc2acffc49b60e9c70" 91
    "0xd4307e0acd12cf46fd6cf93bc264f5d5d1598792" 36
    "0x57f1887a8bf19b14fc0df6fd9b2acc9af147ea85" 35
    "0x9131d8c7a411d90c6b164d296440701a0e5b3178" 35
    "0xbc886e22683680026c9028ef86667dcad36e90e3" 34
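If you don’t have `eth_abi` at hand, the same footer can be decoded with plain Python, since all four fields are static 32-byte words. This is a sketch that assumes the layout used above (a 45-byte proxy followed by salt, chain id, token contract, and token id); the function name is made up.

```python
def decode_eip6551_footer(code_hex: str) -> dict:
    """Split the 128 bytes after the 45-byte proxy into four ABI words."""
    data = bytes.fromhex(code_hex)[45:]
    if len(data) != 128:
        raise ValueError("expected exactly four 32-byte words after the proxy")
    words = [data[i : i + 32] for i in range(0, 128, 32)]
    return {
        "salt": int.from_bytes(words[0], "big"),
        "chain_id": int.from_bytes(words[1], "big"),
        # an address occupies the low 20 bytes of its word
        "token_contract": "0x" + words[2][12:].hex(),
        "token_id": int.from_bytes(words[3], "big"),
    }


# a synthetic example: placeholder implementation, chain id 1, token id 42
proxy = "363d3d373d3d3d363d73" + "be" * 20 + "5af43d82803e903d91602b57fd5bf3"
footer = (
    (0).to_bytes(32, "big")
    + (1).to_bytes(32, "big")
    + bytes.fromhex("a87ea7c8745980490bcdcff97fe7328535098cd1").rjust(32, b"\x00")
    + (42).to_bytes(32, "big")
)
print(decode_eip6551_footer(proxy + footer.hex()))
```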

    Unknown variants

    Finding a new proxy type every time I pick up this article has become a bit of a running joke. To put it to bed, we need to come up with a definitive way to find all possible variants. Let’s start from an assumption that all minimal proxies contain this code.

    PUSH_ <implementation> GAS DELEGATECALL

    As I’ve shown in my other article22, you can’t simply regex in this case, because parts of the code could have different semantic meaning. We need to disassemble the code so we don’t accidentally match PUSH arguments or data.
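A tiny example shows how a naive substring search goes wrong. The helper names are made up, and the opcode values are hardcoded so the sketch has no dependencies. The bytecode below pushes the two bytes `5af4` as data and never executes a delegatecall.

```python
GAS, DELEGATECALL = 0x5A, 0xF4
PUSH0, PUSH1, PUSH32 = 0x5F, 0x60, 0x7F


def naive_has_delegatecall(code_hex: str) -> bool:
    """Substring search: cannot tell instructions from PUSH arguments."""
    return "5af4" in code_hex


def opcodes(code_hex: str):
    """Yield only the opcode bytes, skipping PUSH arguments."""
    code = bytes.fromhex(code_hex)
    pc = 0
    while pc < len(code):
        op = code[pc]
        yield op
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH0
        pc += 1


def real_has_delegatecall(code_hex: str) -> bool:
    """Look for GAS immediately followed by DELEGATECALL among real opcodes."""
    ops = list(opcodes(code_hex))
    return any(a == GAS and b == DELEGATECALL for a, b in zip(ops, ops[1:]))


tricky = "615af45000"  # PUSH2 0x5af4, POP, STOP
print(naive_has_delegatecall(tricky))  # True
print(real_has_delegatecall(tricky))   # False
```

A substring search on hex can also match at an odd offset, straddling two unrelated bytes, which is one more reason to walk the opcodes instead.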

    small_codes.select(pl.col.num_contracts.sum(), num_codes=pl.count())
    num_contracts num_codes
    49481339 138650

    We keep track of the last push instruction while looking for a delegatecall pattern we identified. The push argument would contain the implementation address.

    If you are interested in a more advanced version of this algorithm, check out WhatsABI23.

  • 23 disasm.ts in shazow/whatsabi

    from collections import deque
    from ethereum.shanghai.vm.instructions import Ops
    
    opcodes = {op.value: op for op in Ops}
    
    class PUSH:
        def __eq__(self, op):
            if isinstance(op, Ops):
                op = op.value
            return Ops.PUSH1.value <= op <= Ops.PUSH32.value
    
    def disasm(code):
        """Disassemble the code and extract the implementation address."""
        code = bytes.fromhex(code)
        # sliding window over the last three opcodes, kept as raw ints so that
        # bytes that are not valid opcodes (data, metadata) don't raise in Ops()
        ops = deque(maxlen=3)
        pushes = deque(maxlen=1)
        pc = 0
        while pc < len(code):
            op: int = code[pc]
            ops.append(op)
            if Ops.PUSH1.value <= op <= Ops.PUSH32.value:
                # remember the push argument, then skip over it
                push_size = op - Ops.PUSH0.value
                push_value = code[pc + 1 : pc + 1 + push_size]
                pushes.append(push_value.hex())
                pc += push_size
    
            if list(ops) == [PUSH(), Ops.GAS.value, Ops.DELEGATECALL.value]:
                return {"impl": pushes[-1], "push_size": ops[0] - Ops.PUSH0.value}
    
            pc += 1
    
        return {"impl": None, "push_size": None}

    We can apply a python function to a column using map_elements. The returned dict becomes a struct, which can be turned into columns using unnest.

    Addresses that come from opcodes other than PUSH20 need to be normalized. Check longer pushes for empty padding bytes, and pad shorter ones with zeros.

    found_proxies = (
        small_codes
        .with_columns(impl=pl.col.code.map_elements(disasm))
        .unnest('impl')
        .filter(pl.col.impl.is_not_null())
        .with_columns(pl.col.impl.str.rjust(64, '0'))
        .filter(pl.col.impl.str.slice(0, 24) == '0' * 24)
        .with_columns(pl.col.impl.str.slice(24, 40))
    )
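The normalization step is easier to follow on a single value. Here is a plain Python equivalent of the three string operations in the query above (the function name is hypothetical):

```python
def normalize_impl(push_arg_hex: str):
    """Return a 40-char address, or None if the push can't hold an address."""
    # left-pad to a full 32-byte word, as the EVM would see it on the stack
    padded = push_arg_hex.rjust(64, "0")
    # anything with non-zero bytes above the low 20 bytes is not an address
    if padded[:24] != "0" * 24:
        return None
    return padded[24:]


print(normalize_impl("9c9d720cee4c096765a0581068d97aa9"))  # PUSH16 argument
# 000000009c9d720cee4c096765a0581068d97aa9
print(normalize_impl("00" * 12 + "be" * 20) == "be" * 20)  # padded PUSH32: True
print(normalize_impl("ff" * 32))  # None
```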

    We have found more proxies and improved our results by 8.5%.

    found_proxies.select(
        num_contracts=pl.col.num_contracts.sum(),
        num_codes=pl.count(),
        num_impls=pl.col.impl.unique().count(),
    )
    num_contracts num_codes num_impls
    14675709 47064 31962
    (
        found_proxies.select(pl.col.num_contracts.sum())
        / proxies_by_type.select(pl.col.num_contracts.sum())
    )
    num_contracts
    1.085049

    We have also collected interesting statistics about vanity addresses.

    found_proxies.select(pl.col.push_size.value_counts(sort=True)).unnest('push_size')
    push_size counts
    20 46926
    16 116
    32 9
    18 6
    15 4
    17 1
    13 1
    14 1

    It is up to you to verify whether someone has actually managed to mine an address with 7 leading zero bytes. I will just provide a list of suspect addresses for convenience.

    (
        found_proxies.filter(pl.col.push_size < 16)
        .select("impl", "first_contract", "push_size")
        .sort("push_size")
    )
    impl first_contract push_size
    "000000000000001d48ffbd0c0da7c129137a9c55" "0000000000bf2686748e1c0255036e7617e7e8a5" 13
    "000000000000df8c944e775bde7af50300999283" "00000000005df2c9274605abd6773f14578f8a28" 14
    "00000000004096437c84e1b0927d5ed44f45f6b3" "2ea66a667e258594a05c94345775a02c5fb1b74c" 15
    "00000000008eabbe9a46fa87f0d1e41e62a96d50" "a9a90922d6d2a5c3a36c0170bb0cefac5300b779" 15
    "0000000000c08718718b974d644b098c19bd0064" "8608a089a9ec9a5483faa12a56e987901aad27ce" 15
    "000000000045ef846ac1cb7fa62ca926d5701512" "f4907e6182157d564dc180978499b48ebd333f77" 15

    Seeing patterns

    As you could’ve observed from regex patterns, all codes end with an f3 (RETURN) instruction. We can exploit this fact to strip the data section. Moreover, if we also strip the PUSH arguments, we end up with what I call a mini code that uniquely identifies a specific proxy pattern.

    def strip_code(code):
        """Distill the code to its minimal form."""
        code = bytes.fromhex(code)
        mini_code = []
        pc = 0
        while pc < len(code):
            op = code[pc]
            mini_code.append(op)
            if op == PUSH():
                push_size = op - Ops.PUSH0.value
                pc += push_size
            
            if op == Ops.RETURN.value:
                break
            
            pc += 1
        
        return bytes(mini_code).hex()
    
    strip_code(
        '363d3d373d3d3d363d73'
        'bebebebebebebebebebebebebebebebebebebebe'  # implementation
        '5af43d82803e903d91602b57fd5bf3'
        'b00ba1'  # data section
    )
    '363d3d373d3d3d363d735af43d82803e903d916057fd5bf3'

    Combine it all together and we find 80 unique delegatecall patterns, including the obscure ones and the ones that encode immutable args.

    known_names = (
        base_proxies.unique("proxy_type")
        .filter(pl.col.proxy_type != "EIP-6551")
        .with_columns(mini_code=pl.col.code.map_elements(strip_code))
        .select("mini_code", "proxy_type")
    )
    proxy_types = (
        found_proxies.with_columns(mini_code=pl.col.code.map_elements(strip_code))
        .group_by("mini_code")
        .agg(
            num_codes=pl.count(),
            num_impls=pl.col.impl.unique().count(),
            num_contracts=pl.col.num_contracts.sum(),
            first_contract=pl.col.first_contract.first(),
        )
        .join(known_names, on="mini_code", how="left")
        .sort("num_contracts", descending=True)
    )
    proxy_types
    mini_code num_codes num_impls num_contracts first_contract proxy_type
    "363d3d373d3d3d363d735af43d82803e903d916057fd5bf3" 37349 31618 13504799 "e5f740036228a299f7498c82be966f1ad750bf94" "EIP-1167"
    "583681803780803681735af43d91908282803e6057fd5bf3" 14 14 955047 "40d87fa19a6d1055d42a08e1f79ccefdae1a1ce1" null
    "60605234615760513660823760803683735af43d8060843e81801561578184f3" 7 7 79582 "40d7d02529045b7ec1d443c540d9522d052164ad" null
    "3d3d3d3d363d3d37606036393660013d735af43d3d93803e6057fd5bf3" 8607 29 40689 "e6216d1144ac51b33efa326672d4aafc08be93a1" "sudoswap ClonesWithImmutableArgs"
    "606052366057005b3660803760603660735af45000fea264221220cd31d7df5797d8b… 1 1 37233 "06c683bf06e0e8c2709f6eb90451b06f2977b59d" null
    "60808080803680928037735af43d828181803e8083146057f3" 20 20 12782 "2f4b849bdac28903fd7bb142cdbc6005c770ad2d" null
    "3d3d3d3d363d3d37363d735af43d3d93803e6057fd5bf3" 17 17 9928 "cdff0b2aab0f203860be730a077ca7213fecaa8c" "0age"
    "6060523660811461573660803760803660735af43d60803e8060811461573d60f3" 6 4 5965 "25dbb91164bf15bd16df3f5a63a062d6aebdabfb" null
    "3660603761603660735af41558576160f3" 21 21 4518 "a698d462dc0f38844ca4e72dfcce372b982fa67b" "Vyper Beta"
    "603681803780803681735af43d82803e1560573d90f3" 23 23 4457 "3fb109e7df20e60d55ca38788d85c1e243f82931" null
    "6060523660803760803660735af43d60803e8060811460573d60f3" 4 3 3189 "f38eb9e060fc7104908db80968b3e2a7a226b07f" null
    "366057343d52307f603da23d3df3" 1 1 2890 "f8843981e7846945960f53243ca2fd42a579f719" "0xSplits Clones"
    "6060523660803760803660735af43d60803e80801560573d60f3" 1 1 2779 "728100299558e2d5790e1a7f9ea12de33f534794" null
    "6080368181838037735af400" 2 2 2701 "7bf6a2c08923e69a503031b56df35e98a98a5925" null
    "606052366057005b3660803760603660735af45000fea264221220fec0577d6f" 1 1 1663 "276628e5af6981f324abe82823985f123a3b0202" null
    "60605234615760513660823760803683735af43d8060843e8160811461578184f3" 3 2 1591 "2cf63d2f74890509a2bd42c9011e2dffe0928dd0" null
    "6060526036101560575b6036818037808036817f5af43d82803e1560573d90f3" 2 2 908 "a3bad5098f9489f536342ca9957bbc808d9d5d96" null
    "61605a031015605734156057346052606060606073605a03f11560576060f3" 1 1 717 "319e17a9c9153737fee4d5b5a0b76ad71732ece8" null
    "3660803761603660735af43d60803e8060811460573d60f3" 14 14 699 "8dd410227d3909b35f662825555443b6da7c0007" null
    "3660573415605734605233606080a25b005b608052606060376360511415605773605… 1 1 631 "34c22a846ab26d9e590ccad985c4e9b6d96fcbfa" null
    "3660603761603660735af460576080fd5b6160f3" 35 35 481 "fafdab9e3292612f8d07727a6375103b7b338fc0" "Vyper"
    "363d3d37616036393d3d3d3661013d735af43d82803e903d916057fd5bf3" 326 21 336 "7d7d26d129e793e2a4c0b96b17e18f2926a8a8e1" "ClonesWithCallData"
    "3d3d3d3d363d3d37616036393661013d735af43d3d93803e6057fd5bf3" 262 36 264 "5ed82249fa6e91becb35ef03044e22eb2dbde912" "ClonesWithImmutableArgs"
    "3660803760803660735af43d60803e603d916057fd5bf3" 4 4 254 "54f85a9da2d3c5f0059dc21f016d5c4f9d14df8c" null
    "3d3d3d3d3d735af48180f3" 1 1 206 "9946bb3e71bf3b94278fce25053d0803da6150f7" null
    "3d3d3d363d3d373d3d363d735af46057fd5b3d913e3d90f3" 2 2 200 "f57113d8f6ff35747737f026fe0b37d4d7f42777" null
    "61605a031015605734156057346052606060606073605a03f1506060f3" 1 1 162 "ef9fe029a6134ef5c08a1f4c880efa0af8c3b856" null
    "363d3d373d3d3d363d6f5af43d82803e903d916057fd5bf3" 2 2 131 "a81043fd06d57d140f6ad8c2913dbe87fdecdd5f" "EIP-1167 Optimized"
    "6060526036101560575b5f36818037808036817f5af43d82803e1560573d90f3" 2 2 118 "551485188cded3562290b928e34e963fc2234bf8" null
    "3d3d3d3d363d3d37363d6f5af43d3d93803e6057fd5bf3" 113 1 113 "f3f6373a79021e3610e9e3a97cf8fd58e7ca043e" null
    "60605260513660823760803683735af43d8060843e8160811460578184f3" 2 2 100 "900de34640ad9be267e4255f66fd7888c9353be9" null
    "606052603660603780603660735af480801560578260f3" 6 5 90 "68ee9c1c7af2836b1bbebb4df65d39c54ef21458" null
    "366057343d527f593da1005b363d3d373d3d3d3d618060363936013d735af43d3d938… 78 14 78 "7c340d736976c995db51534fa82376d78cbdbb6f" "solady ClonesWithImmutableArgs"
    "6060526036106157637c60350416638114615780631461575b3660803760803660735… 1 1 56 "00224d9084fd7cdb4a5ce7740cb1ca0dca6be7a6" null
    "363d3d373d3d3d363d715af43d82803e903d916057fd5bf3" 6 6 54 "927fba49239b3c163ff14464ed6c57444836f3a1" null
    "363d3d373d3d3d3d608038038091363936013d735af43d3d93803e6057fd5bf3" 44 5 44 "ab67e433c596dd1382b5756d8c7e4f0d69fb1bc7" null
    "366057343d527f593da1005b3d3d3d3d363d3d37363d735af43d3d93803e6057fd5bf… 2 2 38 "693c49a6296d90e8a8936ad4836a680f551bb97d" null
    "363d3d373d3d3d363d705af43d82803e903d916057fd5bf3" 1 1 25 "bd21981556da7d96ee6cce209f0ac292490065cc" null
    "6060526036818037808036817f5af43d82803e1560573d90f3" 4 2 21 "b85df5ee83eadb92dbc32f13403372c8f880cf02" null
    "6060523373146081146057606080376060f3" 2 2 18 "da07dddec856e9bcdfdcc428f577db94d71759c0" null
    "6036818037808036816f5af41560573d81803e3d81f3" 1 1 18 "9fb5fa690a9e77d7195853283cd6f36d77d8c1ae" null
    "603681803780803681735af41560573d81803e3d90f3" 2 2 18 "c54a597a7c2cf5ed6867095cb69a42eac21fc7a0" null
    "36608037608036606e5af43d60803e603d916057fd5bf3" 1 1 16 "f4907e6182157d564dc180978499b48ebd333f77" null
    "3d3d3d3d363d3d37606036396036013d735af43d928390803e6057fd5bf3" 15 2 15 "e6e1dd9aae13bd8b19853bd15875162ff4fdf553" null
    "5a61116357608080807f5af43d60803e6014156357605160141563575b005b6036106… 1 1 11 "f1718480cd208273e6a317746a6cd8c138ec9a82" null
    "363d3d373d3d3d363d6e5af43d82803e903d916057fd5bf3" 3 3 6 "2ea66a667e258594a05c94345775a02c5fb1b74c" null
    "3d603d52363d59377359523d3d6080590390735af4813d3d82803e91156157f3" 1 1 5 "613359585a176284a5280b072e1d6dc00320c385" null
    "5f5f365f5f37365f735af43d5f5f3e60573d5ffd5b3d5ff3" 5 5 5 "39a0f5da85b4a08686b6e931860e5192deb685f0" "Solady PUSH0"
    "3d3d3d363d3d37608038038091363936013d735af43d82803e3d82826057fd5bf3" 4 2 4 "4498d2f48eb2d7473c30ec3b010b8c4ce69417fb" null
    "60605234801560576080fd5b50603681823780813683735af43d82833e80801560573… 2 1 4 "0e946e9e1089e10992321e1e8a5aa2b9162a5bab" null
    "733314151561573660603760603660735af4156157005bfe5b603580601a604316031… 3 2 4 "66f049111958809841bbe4b81c034da2d953aa0c" null
    "3d3d3d3d363d3d37363d6d5af43d3d93803e6057fd5bf3" 1 1 4 "00000000005df2c9274605abd6773f14578f8a28" null
    "366057343d527f603da13d3df3" 3 1 3 "f29ff96aaea6c9a1fba851f74737f3c069d4f1a9" "0xSplits ClonesWithImmutableArgs"
    "606052603681823780813683735af43d82833e80801560573d83f3" 3 3 3 "2d17e6d1b4ca0d721e48a283793b789de85b8403" null
    "5f5f365f5f37365f735af43d5f5f3e603d5ffd5b3d5ff3" 2 1 2 "5460e1f077e08457f899e7dabb71b2f871f8be23" null
    "60606060735af4" 1 1 2 "beaca7ad81ae2ef8403ee3593635d108f1856331" null
    "5f5f365f5f735af43d5f5f3e60573d5ffd5b3d5ff3" 2 2 2 "b94921e26618c9f0b56e9171b498d2ba471c49ce" null
    "3660803760803660735af43d60803e60573d60fd5b3d60f3" 2 2 2 "60055deba04dcc173515f9398337987655fea808" null
    "6080356381601c14156157815433811461578283fd5b6035603581818383131561575… 1 1 2 "3f9e41b2b4548ccb7b6073c5278738fc59e14183" null
    "363d3d373d3d3d3d618060363936013d735af43d3d93803e6057fd5bf3" 2 1 2 "ae4c7ddef4360d0cb5156b2c1435761525ff3997" "sw0nt ClonesWithImmutableArgs"
    "3660603760803660735af480801560573d8060603e60f3" 1 1 2 "3b1eb95182f769cea6b8808c4da448094756a2ca" null
    "603681823780813683735af4151560573d81823e3d81fd5b3d81823e3d81f3" 1 1 2 "e36d76da7431d38d1958ef0367785030d523fa17" null
    "60605234801560576080fd5b5060513660823760803683735af43d8060843e8160811… 1 1 2 "2769f4170a9b35ccdb946dc0ff853792f9dbb1ae" null
    "3660603760803660735af41560573d8060603e60f3" 1 1 2 "1a1a913f2334c7424efb093413ddc3af014175eb" null
    "6060523660576f5f525f805f80735af460575f80fd5b005b5f80fdfea2642212207e6… 1 1 1 "fca8d0ac3eb28a7e3bef2e313b5ddf5c83076e0a" null
    "6060523360526060207f8114615763541561577f60527c60527c60526060526060fd5… 1 1 1 "09afae029d38b76a330a1bdee84f6e03a4979359" null
    "3373146157005b3661575f5f5f5f47335af1005b7f5f52475f5b8035601c60525f5f6… 1 1 1 "878715422f2fc2d939f8d5a4e77eecd80158aa4b" null
    "3d3d343d146057f3" 1 1 1 "4e4f343e93ad73d7098af84418023b5677917582" null
    "6060523660576060565b005b6060565b005b3660803760803660735af43d60803e806… 1 1 1 "6b741aa01f060fcd702ff0baade110372f219d39" null
    "60605234801561576080fd5b5060361061576035601c80631461575b3660803760803… 1 1 1 "335157e30f88ffe30f9f03cde78c44265922895d" null
    "60548060575b3360553660555b80358155600136811060576060603933606034f5806… 1 1 1 "2319c999808d589b5fb33349b182f000d281dc65" null
    "733314151561573660603760603660735af450005b60356181601c61565b6035601c8… 1 1 1 "9d68689a0568caed355a2f22dbde370bc33e671f" null
    "60605236801561573660803760803660735af43d60803e80801561573d60f3" 1 1 1 "48babd296989592346e8b5cb904895f808e75e61" null
    "733314151561573660603760603660735af4156157005bfe5b60356181601c61565b6… 1 1 1 "12e98c9b9b266255bf037ccafcf87657e9a0b31b" null
    "365f5f375f5f365f6c5af43d5f5f3e5f3d916057fd5bf3" 1 1 1 "0000000000bf2686748e1c0255036e7617e7e8a5" null
    "363d3d373d3d3d363d735af43d82803e903d9160575b61565b6060606047735af100" 1 1 1 "1aa6c2224d19457fdf7d7143243267859b05c1cb" null
    "3d6080603d3981f3" 1 1 1 "d9244c769512d951eb8939075d1857011aad9248" null
    "6060523660577f5f525f80605f735af460575f80fd5b005b5f80fdfea264221220d8e… 1 1 1 "7631961ad277714cce7567f2303e88d881d13b5a" null
    "60605234151561573660803760803660735af43d60803e8060811461573d60f3" 1 1 1 "b22a3f3d4b7378275de40d2d840d9f4678dada81" null
    "603681823780813683735af43d82833e8060811460573d83f3" 1 1 1 "a009c7e3718740f56294914b3f96133f338df0b5" null

    It’s left as an exercise for the reader to see if I’ve missed anything, but the code certainly hasn’t.
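One way to start on that exercise is to reproduce the first column of the table for a contract you already know. The sketch below assumes each entry is the runtime bytecode with the immediate data of every PUSH1..PUSH32 dropped, so only the opcodes remain. That reading is inferred from the rows themselves and is not a confirmed description of the original query. The names `skeleton` and `has_delegatecall` are hypothetical helpers introduced for illustration.

```python
def skeleton(code_hex: str) -> str:
    """Return the opcodes of EVM bytecode with PUSH immediates removed.

    Assumption: this mirrors how the table's first column was derived.
    """
    if code_hex.startswith("0x"):
        code_hex = code_hex[2:]
    code = bytes.fromhex(code_hex)
    out = bytearray()
    i = 0
    while i < len(code):
        op = code[i]
        out.append(op)
        i += 1
        # PUSH1 (0x60) .. PUSH32 (0x7f) carry 1..32 bytes of immediate data.
        # PUSH0 (0x5f) carries none, so it falls through untouched.
        if 0x60 <= op <= 0x7F:
            i += op - 0x5F
    return out.hex()


def has_delegatecall(code_hex: str) -> bool:
    """True if a DELEGATECALL (0xf4) byte survives in the skeleton.

    Caveat: a trailing metadata blob is walked as if it were code,
    so this check can report false positives.
    """
    return 0xF4 in bytes.fromhex(skeleton(code_hex))


# EIP-1167 runtime code with a placeholder implementation address.
eip1167 = (
    "363d3d373d3d3d363d73"
    + "be" * 20
    + "5af43d82803e903d91602b57fd5bf3"
)
print(skeleton(eip1167))
# → 363d3d373d3d3d363d735af43d82803e903d916057fd5bf3
print(has_delegatecall(eip1167))
# → True
```

Because the address and the jump destination are both PUSH data, proxies that differ only in their implementation address or code layout offsets collapse into the same skeleton, which is what makes grouping by it useful.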

    Hope you learned something interesting today.