Zellic - Audit Reports

Category: Code Maturity

Endianness for `Bytes32`

Impact: Informational

Severity: Informational

Likelihood: N/A

Description

The Bytes32 type (sdk/api_bytes32.go) is intended as an in-circuit representation of the similarly named Solidity type, as the documentation comment states:

// Bytes32 is an in-circuit representation of the solidity bytes32 type.
type Bytes32 struct {
	Val [2]frontend.Variable
}

A Solidity bytes32 type is a sequence of 32 bytes. For storage and the stack, Ethereum operates with 32-byte words, so one bytes32 is exactly one such word. Only memory is addressed bytewise. When storing a bytes32 in memory, the first byte is stored at the lowest address. The following Solidity example demonstrates this:

pragma solidity ^0.8.0;

contract Example {
    function demonstrateBytes32InMemory() public pure returns (string memory) {
        // Initialize bytes32 with a constant example string
        bytes32 b32 = "Hello world!";

        // Overwrite the second byte (lowest address + 1) with a new value, say 'X'
        // 'X' ASCII value is 88
        assembly {
            mstore(mload(0x40), b32) // Store b32 in memory
            mstore8(add(mload(0x40), 1), 88) // Store 0x58 (ASCII 'X') at the second byte position
            b32 := mload(mload(0x40)) // Load b32 back from memory
        }

        // Return the bytes32 as a string
        // Create a new bytes memory representation for string conversion
        bytes memory result = new bytes(32);
        assembly {
            // Copy the bytes32 value into the result bytes array
            mstore(add(result, 32), b32) // Store the bytes32 value in the new bytes array
        }

        return string(result); // Convert bytes to string and return
    }
}

When a type consisting of successive bytes or bits, such as bytes32, is interpreted as an unsigned integer, endianness needs to be taken into account: one may interpret the first byte or bit as the least significant (little endian) or as the most significant (big endian). Ethereum uses the big-endian convention for such conversions, as described in the first sentence of Appendix H of the Yellow Paper.
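As a small illustration of the two conventions, the following Go sketch interprets the same byte sequence both ways. The helper names bigEndianUint and littleEndianUint are ours and do not appear in the audited code.

```go
package main

import (
	"fmt"
	"math/big"
)

// bigEndianUint interprets b with b[0] as the most significant byte,
// which is the convention Ethereum uses when converting bytes to integers.
func bigEndianUint(b []byte) *big.Int {
	return new(big.Int).SetBytes(b)
}

// littleEndianUint interprets b with b[0] as the least significant byte.
func littleEndianUint(b []byte) *big.Int {
	rev := make([]byte, len(b))
	for i, v := range b {
		rev[len(b)-1-i] = v
	}
	return new(big.Int).SetBytes(rev)
}

func main() {
	b := []byte{0x01, 0x02}
	fmt.Println(bigEndianUint(b))    // 0x0102 = 258
	fmt.Println(littleEndianUint(b)) // 0x0201 = 513
}
```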

Let us return to the Bytes32 type that is part of the Brevis SDK. Internally, it stores its data in two circuit variables, Val[0] and Val[1]. Two circuit variables are needed because the circuit is defined over a finite field of prime order r, where r is a 254-bit prime, so a single field element is too small to hold 32 bytes (256 bits) of data. The most natural expectation would be that Val[0] encodes bytes 0 through k for some k, and Val[1] encodes bytes k+1 through 31.

Let us consider the toBinaryVars function, which converts a Bytes32 to 256 circuit variables representing the bits that make up the 32-byte-long byte string.

// toBinaryVars defines the circuit that decomposes the Variables into little endian bits
func (v Bytes32) toBinaryVars(api frontend.API) []frontend.Variable {
	var bits []frontend.Variable
	bits = append(bits, api.ToBinary(v.Val[0], numBitsPerVar)...)
	bits = append(bits, api.ToBinary(v.Val[1], 32*8-numBitsPerVar)...)
	return bits
}

This implementation fits with the interpretation of Val[0] and Val[1] just given: the first numBitsPerVar bits, which should correspond to the first numBitsPerVar / 8 bytes, are stored in Val[0], with the remaining bytes stored in Val[1]. The api.ToBinary function decomposes field elements into little-endian bits. Thus, we come to the conclusion that a Bytes32 is stored by dividing the bytes into the first numBitsPerVar / 8 bytes and the remainder. The former are stored in Val[0] by interpreting them as the little-endian representation of an unsigned integer in base 256, and the latter are stored in Val[1] in the same way.

The FromBinary function fits with this interpretation:

func (api *Bytes32API) FromBinary(vs ...Uint248) Bytes32 {
	var list List[Uint248] = vs
	values := list.Values()
	for i := len(vs); i < 256; i++ {
		values = append(values, 0)
	}
	res := Bytes32{}
	res.Val[0] = api.g.FromBinary(values[:numBitsPerVar]...)
	res.Val[1] = api.g.FromBinary(values[numBitsPerVar:]...)
	return res
}

However, ConstBytes32 functions differently:

// ConstBytes32 initializes a constant Bytes32 circuit variable. Panics if the
// length of the supplied data bytes is larger than 32.
func ConstBytes32(data []byte) Bytes32 {
	if len(data) > 32 {
		panic(fmt.Errorf("ConstBytes32 called with data of length %d", len(data)))
	}

	bits := decomposeBits(new(big.Int).SetBytes(data), 256)

	lo := recompose(bits[:numBitsPerVar], 1)
	hi := recompose(bits[numBitsPerVar:], 1)

	return Bytes32{[2]frontend.Variable{lo, hi}}
}

This function is passed a slice of bytes data. Based on what was discussed before regarding the other functions, we would expect that data[0] through data[(numBitsPerVar / 8) - 1] are stored in Val[0] and the remaining bytes in Val[1].

Instead, the function first uses new(big.Int).SetBytes(data) to obtain a big.Int from data. This interprets data as big endian, so data[0] will be the most significant byte of the resulting integer. This integer is then converted to bits with decomposeBits, which orders the bits in little endian. Thus, for a 32-byte input, data[0], as the most significant byte, will occur as bits 248 to 255. Finally, the bits are recomposed (again using the little-endian interpretation) into two values. The value lo, used for Val[0], will consist of the first numBitsPerVar bits, which correspond to the bytes data[31-0] to data[31-((numBitsPerVar / 8) - 1)]. So the least significant eight bits of Val[0] will be data[31], the next least significant eight bits will be data[30], and so on, up to the most significant eight bits, data[31-((numBitsPerVar / 8) - 1)]. The second field, Val[1], will have the byte data[31-(numBitsPerVar / 8)] as its least significant eight bits.

The way ConstBytes32 handles its argument thus does not fit an interpretation of the Bytes32 data type that is also compatible with the other functions; ConstBytes32 reverses the order of the bytes.

It would be instructive to also look at the following function, SlotOfArrayElement, from sdk/circuit_api.go:

// SlotOfArrayElement computes the storage slot for an element in a solidity
// array state variable. arrSlot is the plain slot of the array variable.
// index determines the array index. offset determines the
// offset (in terms of bytes32) within each array element.
func (api *CircuitAPI) SlotOfArrayElement(arrSlot Bytes32, elementSize int, index, offset Uint248) Bytes32 {
	//api.Uint248.AssertIsLessOrEqual(offset, ConstUint248(elementSize))
	o := api.g.Mul(index.Val, elementSize)
	return Bytes32{Val: [2]variable{
		api.g.Add(arrSlot.Val[0], o, offset.Val),
		arrSlot.Val[1],
	}}
}

Here, slots in storage are addressed with Bytes32. By the Ethereum specification, these should be interpreted as big endian when converting them to the unsigned integers that index slots, so that the offset can be added. However, the function adds the (assumed small) offset to Val[0], which suggests that the slot address is actually stored in little endian in the Bytes32 given as argument, and that the same holds for the return value.

This is compatible with ConstBytes32 if a []byte input is obtained by converting the address to 32 bytes using Ethereum's big-endian standard and is then converted to Bytes32 using ConstBytes32, which flips the order of the bytes. The return value of SlotOfArrayElement could then be compared against similar addresses that were also obtained in flipped representation using ConstBytes32. Both ConstBytes32 and SlotOfArrayElement are implemented with a surprising reversal of the order of the bytes, but these reversals cancel each other out, so the two functions are compatible with each other.

Impact

The interface for Bytes32 and its use are confusing and inconsistent regarding how the type is to be interpreted and in which order the bytes are stored. This can cause mistakes when users of the SDK work with this type.

The root cause is that Bytes32 is used in SlotOfArrayElement and ConstBytes32 as if it were a type for 256-bit unsigned integers. Endianness questions arise whenever one converts between a type consisting of a list of values (such as a list of bytes) and a type for numeric values. Using the same type with both interpretations makes the need for such conversions particularly confusing. The Bytes32 type should thus not be used like this; for 256-bit unsigned integers, a Uint256 type should be used. This would allow for explicit and thereby cleaner and more transparent conversions.

Recommendations

We recommend clearly documenting which functions flip the order of bytes and which endianness is used on each conversion.

We also recommend resolving the current discrepancy in the ordering of the bytes/bits between ConstBytes32 and the toBinaryVars and FromBinary functions.

The option we would suggest is to introduce a new type Uint256 for use cases such as SlotOfArrayElement. It could be documented that this type stores its data in little endian. If the current ConstBytes32 were copied for use with Uint256 and renamed to something like ConstFromBigEndianBytes, it would be transparent how this type behaves. As the function takes its argument in big-endian order but stores the data in little-endian order, it reverses the order, which is then as expected. In this case, the ConstBytes32 function for Bytes32 should be changed to not flip the order of the bytes, to establish compatibility with toBinaryVars and FromBinary.

An alternative would be to change ConstBytes32 to store the first byte in Val[0] and the last 31 bytes in Val[1]. Those 31 bytes should be stored so that the least significant eight bits of Val[1] correspond to byte 31. Done this way, Val[0] and Val[1] would be ordered in the expected way, with Val[0] holding lower-indexed bytes than Val[1]. This would be compatible with toBinaryVars and FromBinary if they are changed to take into account that Val[1] now stores a list of bytes in big-endian instead of little-endian order. Additionally, it would still be possible to do the addition needed in SlotOfArrayElement by adding to just one variable (now Val[1]).
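A rough out-of-circuit sketch of how this split into two types could look is given below. Everything in it is hypothetical: the Uint256 type, the field names, the use of big.Int in place of circuit variables, the assumed limb width of 248 bits, and the choice to right-pad short inputs in ConstBytes32.

```go
package main

import (
	"fmt"
	"math/big"
)

const numBitsPerVar = 248 // assumed limb width

// Uint256 is a hypothetical numeric type: Lo holds bits 0..247, Hi bits 248..255.
type Uint256 struct{ Lo, Hi *big.Int }

// Bytes32 is a stand-in that keeps bytes in order: Val[0] holds bytes 0..30
// and Val[1] holds byte 31, each limb little endian.
type Bytes32 struct{ Val [2]*big.Int }

// ConstFromBigEndianBytes reads data as a big-endian integer (short inputs
// are implicitly left-padded with zeros) and stores it in little-endian limbs.
func ConstFromBigEndianBytes(data []byte) Uint256 {
	if len(data) > 32 {
		panic("data longer than 32 bytes")
	}
	v := new(big.Int).SetBytes(data)
	mask := new(big.Int).Lsh(big.NewInt(1), numBitsPerVar)
	mask.Sub(mask, big.NewInt(1))
	return Uint256{Lo: new(big.Int).And(v, mask), Hi: new(big.Int).Rsh(v, numBitsPerVar)}
}

// ConstBytes32 stores data without reversing it; short inputs are
// right-padded with zeros, which is a design choice of this sketch.
func ConstBytes32(data []byte) Bytes32 {
	if len(data) > 32 {
		panic("data longer than 32 bytes")
	}
	var buf [32]byte
	copy(buf[:], data)
	n := numBitsPerVar / 8
	return Bytes32{Val: [2]*big.Int{leUint(buf[:n]), leUint(buf[n:])}}
}

// leUint interprets b as a little-endian unsigned integer in base 256.
func leUint(b []byte) *big.Int {
	x := new(big.Int)
	for i := len(b) - 1; i >= 0; i-- {
		x.Lsh(x, 8)
		x.Or(x, big.NewInt(int64(b[i])))
	}
	return x
}

// lowByte returns the least significant eight bits of x.
func lowByte(x *big.Int) uint64 {
	return new(big.Int).And(x, big.NewInt(0xff)).Uint64()
}

func main() {
	data := make([]byte, 32)
	for i := range data {
		data[i] = byte(i + 1) // data[0] = 1, ..., data[31] = 32
	}
	u := ConstFromBigEndianBytes(data)
	b := ConstBytes32(data)
	// The numeric type reverses the byte order; the byte-string type does not.
	fmt.Println(lowByte(u.Lo), lowByte(b.Val[0])) // 32 1
}
```

With the two constructors separated like this, the reversal is visible in the name of the function that performs it, and Bytes32 keeps a single interpretation.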

Remediation

In , Brevis renamed the ConstBytes32 function to ConstFromBigEndianBytes.
