Calldata, Memory & Storage

Calldata

The calldata is a read-only byte-addressable space where the data parameter of a transaction or call is held. Unlike the stack, to use this data you have to specify an exact byte offset and number of bytes you want to read.

The opcodes provided by the EVM to operate with the calldata include:

  • CALLDATASIZE returns the size of the calldata in bytes.

  • CALLDATALOAD loads 32 bytes of calldata, starting at a given byte offset, onto the stack.

  • CALLDATACOPY copies a given number of bytes of calldata to memory.
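The three opcodes can be tried out from Solidity's inline assembly. The following is a minimal sketch, not a reference implementation: the contract name CalldataReader and the function inspect are hypothetical, and offset 4 assumes the standard ABI layout in which the first four bytes of calldata are the function selector.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract CalldataReader {
    // Returns the total calldata length, the first 32-byte word after the
    // 4-byte function selector, and a copy of the full calldata.
    function inspect(uint256 /* arg */)
        external
        pure
        returns (uint256 size, uint256 firstWord, bytes memory raw)
    {
        assembly {
            size := calldatasize()          // CALLDATASIZE
            firstWord := calldataload(4)    // CALLDATALOAD at byte offset 4

            // Reserve room at the free memory pointer and copy the calldata
            // there with CALLDATACOPY(destOffset, offset, length).
            raw := mload(0x40)
            mstore(raw, size)                       // length prefix of `bytes`
            calldatacopy(add(raw, 0x20), 0, size)   // payload
            // Advance the free memory pointer, rounded up to a 32-byte boundary.
            mstore(0x40, add(add(raw, 0x20), and(add(size, 31), not(31))))
        }
    }
}
```

Calling inspect(7) sends 36 bytes of calldata (a 4-byte selector plus one 32-byte argument), so size is 36 and firstWord is 7.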

Memory

  • Memory is a volatile, read-write, byte-addressable space. It is mainly used to store data during execution, mostly for passing arguments to internal functions. Because it is volatile, every message call starts with cleared memory: all locations are initially zero. Like calldata, memory is addressed at byte level, but a read always returns a 32-byte word.

  • Memory is said to “expand” when we read or write a word in it that was not previously used. In addition to the cost of the access itself, there is a cost to this expansion: 3 gas per 32-byte word plus the square of the word count divided by 512. The total therefore grows linearly for roughly the first 724 bytes and quadratically after that.

  • The EVM provides three opcodes to interact with the memory area:

    • MLOAD loads a word from memory into the stack.

    • MSTORE saves a word to memory.

    • MSTORE8 saves a byte to memory.
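The expansion cost can be written down as a short function. This is a sketch that assumes the memory cost formula from the Ethereum Yellow Paper (3 gas per word plus words² / 512, with integer division); the library name MemoryCost and both function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper, for illustration only.
library MemoryCost {
    // Total gas attributed to a memory of `sizeInBytes` bytes:
    // 3 gas per 32-byte word plus words^2 / 512 (integer division).
    function memoryCost(uint256 sizeInBytes) internal pure returns (uint256) {
        uint256 words = (sizeInBytes + 31) / 32; // round up to whole words
        return 3 * words + (words * words) / 512;
    }

    // Gas charged for growing memory from `oldSize` to `newSize` bytes.
    function expansionCost(uint256 oldSize, uint256 newSize)
        internal
        pure
        returns (uint256)
    {
        if (newSize <= oldSize) return 0;
        return memoryCost(newSize) - memoryCost(oldSize);
    }
}
```

With this formula, 22 words (704 bytes) cost 66 gas with no quadratic contribution, while 23 words (736 bytes) cost 70 gas. The quadratic term first adds 1 gas at 23 words, which matches the threshold of roughly 724 bytes.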

Free Memory Pointer

The free memory pointer is simply a pointer to the location where free memory starts. It ensures smart contracts keep track of which memory locations have been written to and which haven’t.

Solidity’s memory layout reserves four 32-byte slots:

  • 0x00 - 0x3f (64 bytes): scratch space

  • 0x40 - 0x5f (32 bytes): free memory pointer

  • 0x60 - 0x7f (32 bytes): zero slot

Definitions:

  • Scratch space: can be used between statements (i.e. within inline assembly) and for hashing methods.

  • Free memory pointer: holds the currently allocated memory size, i.e. the start location of free memory. Its initial value is 0x80.

  • Zero slot: used as the initial value for dynamic memory arrays and should never be written to.
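A manual allocation shows how the free memory pointer is used together with MLOAD, MSTORE and MSTORE8. This is a minimal sketch: the contract FreePointerDemo and its function are hypothetical, and the pointer read at function entry is typically 0x80 but depends on what the compiler has already allocated.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract FreePointerDemo {
    // Allocates one 32-byte word by hand and returns the free memory pointer
    // before and after the allocation, together with the word read back.
    function allocateWord(uint256 value)
        external
        pure
        returns (uint256 ptrBefore, uint256 ptrAfter, uint256 readBack)
    {
        assembly {
            ptrBefore := mload(0x40)       // current start of free memory
            mstore(ptrBefore, value)       // MSTORE: write a full word
            mstore8(ptrBefore, 0xff)       // MSTORE8: overwrite its first byte
            readBack := mload(ptrBefore)   // MLOAD: read the word back
            ptrAfter := add(ptrBefore, 0x20)
            mstore(0x40, ptrAfter)         // mark the word as allocated
        }
    }
}
```

Because words are stored big-endian, MSTORE8 at the start of the word replaces its most significant byte: allocateWord(1) reads back 0xff00…01.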

Storage

Contract storage is simply a key-to-value mapping. It maps a 32-byte key to a 32-byte value. Given our key is 32 bytes in size, we can have a maximum of 2^256 keys, numbered 0 through (2^256)-1.

Setting a non-zero storage value to zero refunds you some gas, as that key-value pair no longer needs to be stored by the nodes on the network.

Contract variables that are declared as storage variables can be split into 2 camps: fixed-size and dynamic-size.

Fixed-Sized Variables

For fixed-sized variables, Solidity uses reserved storage locations (keys) starting from slot 0 (the key with binary value 0) and moving linearly forward to slot 1, 2, etc. Slots are assigned in the order the variables are declared in the contract, so the first declared storage variable is stored at slot 0.
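The slot assignment can be observed by reading raw slots with SLOAD. The sketch below uses a hypothetical contract FixedSlots with two full-width uint256 variables, so each variable occupies exactly one slot.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract FixedSlots {
    uint256 first = 11;   // slot 0
    uint256 second = 22;  // slot 1

    // Reads a raw 32-byte storage slot with SLOAD.
    function readSlot(uint256 slot) external view returns (uint256 value) {
        assembly {
            value := sload(slot)
        }
    }

    // Returns the slot numbers the compiler assigned to the two variables.
    function slots() external view returns (uint256 s0, uint256 s1) {
        assembly {
            s0 := first.slot
            s1 := second.slot
        }
    }
}
```

Here readSlot(0) returns 11 and readSlot(1) returns 22, matching the declaration order.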

Dynamically-Sized Variables

For dynamically-sized arrays and mappings, there is no way of knowing how many slots to reserve. One could technically choose one of the 2^256 locations at random to store variables; however, this would make it challenging to find the data again.

Hence Solidity uses a hash function to uniformly and repeatably compute locations for dynamically-sized values.

Arrays

A dynamically-sized array needs a place to store its size as well as its elements.

contract StorageTest {
    uint256 a;     // slot 0
    uint256[2] b;  // slots 1-2

    struct Entry {
        uint256 id;
        uint256 value;
    }
    Entry c;       // slots 3-4
    Entry[] d;
}

In the above code, the dynamically-sized array d is at slot 5, but the only thing that’s stored there is the size of d. The values in the array are stored consecutively, starting at the keccak256 hash of the slot number. Each Entry occupies two slots, so element i starts at that hash plus 2 * i.

The following Solidity function computes the location of an element of a dynamically-sized array:

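// Computes the storage slot of element `index` of a dynamic array declared at `slot`.
// `elementSize` is the number of slots each element occupies (e.g. 2 for Entry).
function arrLocation(uint256 slot, uint256 index, uint256 elementSize)
    public
    pure
    returns (uint256)
{
    // keccak256 takes a single bytes argument, so the slot is ABI-encoded first.
    return uint256(keccak256(abi.encode(slot))) + (index * elementSize);
}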
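The formula can be checked against real storage. In this sketch, the hypothetical contract ArraySlots declares a uint256[] at slot 0, so each element occupies one slot and element i is expected at keccak256(0) + i.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract ArraySlots {
    uint256[] values; // slot 0 holds the length

    constructor() {
        values.push(100);
        values.push(200);
    }

    // Reads a raw 32-byte storage slot with SLOAD.
    function readSlot(uint256 slot) public view returns (uint256 value) {
        assembly {
            value := sload(slot)
        }
    }

    // Reads element `index` directly from its computed storage location.
    function elementAt(uint256 index) external view returns (uint256) {
        uint256 base = uint256(keccak256(abi.encode(uint256(0))));
        return readSlot(base + index);
    }

    // The array's own slot stores only its length.
    function length() external view returns (uint256) {
        return readSlot(0);
    }
}
```

After deployment, length() returns 2, elementAt(0) returns 100 and elementAt(1) returns 200.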

Mappings

A mapping requires an efficient way to find the location corresponding to a given key. Hashing the key is a good start, but care must be taken to make sure different mappings generate different locations.

contract StorageTest {
    uint256 a;     // slot 0
    uint256[2] b;  // slots 1-2

    struct Entry {
        uint256 id;
        uint256 value;
    }
    Entry c;       // slots 3-4
    Entry[] d;     // slot 5 for length, keccak256(5)+ for data

    mapping(uint256 => uint256) e;
    mapping(uint256 => uint256) f;
}

In the above code, the “location” for e is slot 6, and the location for f is slot 7, but nothing is actually stored at those locations. To find the location of a specific value within a mapping, the key and the mapping’s slot are hashed together.

The following Solidity function computes the location of a value:

function mapLocation(uint256 slot, uint256 key) public pure returns (uint256) {
    // The 32-byte key is followed by the 32-byte slot, then the 64 bytes are hashed.
    return uint256(keccak256(abi.encode(key, slot)));
}
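The same kind of check works for mappings. The sketch below assumes a uint256 key; the contract MappingSlots and its variable names are hypothetical, and the unused padding variable only pushes the mapping to slot 1.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract MappingSlots {
    uint256 padding;                       // slot 0
    mapping(uint256 => uint256) balances;  // slot 1

    constructor() {
        balances[42] = 4242;
    }

    // Reads the value for `key` straight from storage, bypassing the mapping.
    function valueAt(uint256 key) external view returns (uint256 value) {
        // Location = keccak256(key followed by the mapping's slot number).
        bytes32 location = keccak256(abi.encode(key, uint256(1)));
        assembly {
            value := sload(location)
        }
    }
}
```

valueAt(42) returns 4242, while a key that was never written, such as valueAt(7), returns 0 because unset storage reads as zero.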

Calldata vs. memory vs. storage

Variables are declared as either storage, memory or calldata to explicitly specify the location of the data.

  • storage - the variable is a state variable (stored on the blockchain).

  • memory - the variable is in memory and exists only while a function is being called. It is mutable during the lifespan of the function call.

  • calldata - a special data location that contains function arguments. It is immutable.
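The three locations can appear side by side in a single function. This is a minimal sketch with a hypothetical contract DataLocations; it relies on Solidity copying data when assigning between different locations and creating a reference when assigning a state variable to a local storage variable.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract DataLocations {
    uint256[] stored; // state variable: lives in storage

    function save(uint256[] calldata input) external {
        // `input` is read directly from calldata and cannot be modified.
        uint256[] memory copy = input;   // copies calldata into memory
        if (copy.length > 0) {
            copy[0] += 1;                // memory is mutable during the call
        }
        stored = copy;                   // copies memory into storage

        uint256[] storage ref = stored;  // a reference, not a copy
        ref.push(0);                     // modifies `stored` itself
    }

    function get() external view returns (uint256[] memory) {
        return stored;
    }
}
```

Calling save with [5, 6] leaves [6, 6, 0] in storage: the first element was incremented in the memory copy, and the trailing zero was pushed through the storage reference.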

References

https://noxx.substack.com/p/evm-deep-dives-the-path-to-shadowy-3ea

https://programtheblockchain.com/posts/2018/03/09/understanding-ethereum-smart-contract-storage/
