Assumptions made regarding log topics are not correct in full generality
Description
In Ethereum, a log entry consists of:
the address that originated the log entry
zero or more 32-byte log topics
a data field of zero or more bytes
This is specified in section 4.4.1 of the Ethereum Yellow Paper (version 9fde3f4 from September 2nd, 2024).
The EVM offers five instructions to produce log entries, LOG0, LOG1, LOG2, LOG3, and LOG4, with the number indicating the number of topics attached to the log entry. There is no restriction on what the content of the topics might be, so contracts are free to use any 32 bytes they wish as a topic.
Solidity offers a higher-level abstraction over logs, called events. Events have a name and a list of named arguments that can be of different types. Each argument can be declared indexed or not, and the entire event can be declared anonymous or not. Emitting an event then emits a log entry that can be roughly described as follows:
If the event is not anonymous, then the first topic will be a hash of the event's signature.
The remaining topics are, in order, filled with the indexed fields of the event (the hash of the field, if the type is bigger than 32 bytes). Accordingly, anonymous events can have up to four indexed fields, whereas non-anonymous events can have only three.
The non-indexed fields are encoded together and used as the data part of a log entry.
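The topic layout described above can be summarized in a minimal Go sketch. The function name topicCount and the constant maxTopics are hypothetical and introduced only for illustration; the sketch counts topics and does not perform any hashing or ABI encoding.

```go
package main

import (
	"errors"
	"fmt"
)

// maxTopics is the largest number of topics a log entry can carry (LOG4).
const maxTopics = 4

// topicCount returns how many topics a Solidity event would emit, given
// whether the event is anonymous and how many of its arguments are indexed.
// (Hypothetical helper for illustration, not part of any SDK.)
func topicCount(anonymous bool, indexed int) (int, error) {
	if indexed < 0 {
		return 0, errors.New("negative number of indexed fields")
	}
	n := indexed
	if !anonymous {
		n++ // topics[0] holds the hash of the event's signature
	}
	if n > maxTopics {
		return 0, errors.New("too many indexed fields for one log entry")
	}
	return n, nil
}

func main() {
	cases := []struct {
		anonymous bool
		indexed   int
	}{{false, 3}, {true, 4}, {false, 4}}
	for _, c := range cases {
		n, err := topicCount(c.anonymous, c.indexed)
		fmt.Println(c.anonymous, c.indexed, n, err)
	}
}
```

The third case is rejected because a nonanonymous event already spends one topic on its signature hash.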
Commonly, nonanonymous events are used, and direct usage of LOGn opcodes is rare, though it does occur, for example in the DAI contract.
Brevis allows proving correctness of receipts, which may include some log entries. For the fields of a log entry, the following type is used, defined in sdk/circuit_input.go of the brevis-sdk repository:
// LogField represents a single field of an event.
type LogField struct {
    // The contract from which the event is emitted
    Contract Uint248
    // The event ID of the event to which the field belong (aka topics[0])
    EventID Uint248
    // Whether the field is a topic (aka "indexed" as in solidity events)
    IsTopic Uint248
    // The index of the field. For example, if a field is the second topic of a log, then Index is 1; if a field is the
    // third field in the RLP decoded data, then Index is 2.
    Index Uint248
    // The value of the field in event, aka the actual thing we care about, only 32-byte fixed length values are supported.
    Value Bytes32
}
Here, the value of one of the fields of an event is stored in Value, which is of type Bytes32. The first topic is stored as well, in a field called EventID, which is only of type Uint248 and thus cannot uniquely represent all 32 bytes of the topic. In practice, however, it appears that instead of 248 bits, only the first six bytes (48 bits) of the first topic are actually stored in EventID. The comment on the pack function in sdk/circuit_input.go gives the following reasoning:
// pack packs the log fields into Bn254 scalars
// 4 + 3 * 59 = 181 bytes, fits into 6 fr vars
// 59 bytes for each log field:
// - 20 bytes for contract address
// - 6 bytes for topic (topics are 32-byte long, but we are only using the first 6 bytes distinguish them.
// 6 bytes gives a per-contract 1/2^48 chance of two different events having the same topic)
// - 1 bit for whether the field is a topic
// - 7 bits for field index
// - 32 bytes for value
func (r Receipt) pack(api frontend.API) []frontend.Variable {
    var bits []frontend.Variable
    bits = append(bits, api.ToBinary(r.BlockNum.Val, 8*4)...)
    for _, field := range r.Fields {
        bits = append(bits, api.ToBinary(field.Contract.Val, 8*20)...)
        bits = append(bits, api.ToBinary(field.EventID.Val, 8*6)...)
        bits = append(bits, api.ToBinary(field.IsTopic.Val, 1)...)
        bits = append(bits, api.ToBinary(field.Index.Val, 7)...)
        bits = append(bits, field.Value.toBinaryVars(api)...)
    }
    return packBitsToFr(api, bits)
}
This is likely done to save field elements in the packed representation of a Receipt.
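To make the tradeoff concrete, the following Go sketch reproduces the arithmetic from the quoted comment and compares it with storing the full 32-byte topic. It assumes three log fields per receipt (taken from the "3 * 59" in the comment) and 248 usable bits per field element; the latter is our assumption and is not confirmed by the quoted code, since the body of packBitsToFr is not shown. All names in the sketch are hypothetical.

```go
package main

import "fmt"

const (
	blockNumBytes    = 4   // block number size, from the quoted comment
	fieldsPerReceipt = 3   // from "3 * 59" in the quoted comment
	bitsPerFr        = 248 // ASSUMPTION: usable bits per Bn254 scalar
)

// logFieldBits returns the packed size in bits of one log field when
// topicBytes bytes of the first topic are kept.
func logFieldBits(topicBytes int) int {
	// contract address + topic prefix + IsTopic bit + 7-bit index + value
	return 8*20 + 8*topicBytes + 1 + 7 + 8*32
}

// receiptBits returns the packed size in bits of a whole receipt.
func receiptBits(topicBytes int) int {
	return 8*blockNumBytes + fieldsPerReceipt*logFieldBits(topicBytes)
}

// frVars returns how many field elements are needed to hold the given
// number of bits under the bitsPerFr assumption.
func frVars(bits int) int {
	return (bits + bitsPerFr - 1) / bitsPerFr
}

func main() {
	for _, topicBytes := range []int{6, 32} {
		bits := receiptBits(topicBytes)
		fmt.Printf("topic bytes: %d, receipt bytes: %d, fr vars: %d\n",
			topicBytes, bits/8, frVars(bits))
	}
}
```

Under these assumptions, a six-byte topic prefix gives 181 bytes and 6 field elements, matching the comment, whereas the full 32-byte topic would give 259 bytes and 9 field elements.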
The given argument applies to the case of nonanonymous events: as the first topic consists of a 256-bit hash of the event's signature, we should be able to treat the first 48 bits of this hash as another, shorter hash function, obtaining a chance of only 1/2^48 that two given events have the same 48-bit hash.
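The per-pair figure extends to a contract with several events via a union bound: among n events there are n(n-1)/2 pairs, so the chance that any two share a truncated hash is at most n(n-1)/2 times the per-pair chance. The Go sketch below computes this bound, assuming the truncated hash behaves like a uniformly random function; collisionBound is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"math"
)

// collisionBound returns an upper bound (union bound) on the probability
// that at least two of n distinct events share the same truncated hash of
// the given bit length. ASSUMPTION: truncated hashes are uniformly random.
func collisionBound(n int, bits int) float64 {
	pairs := float64(n) * float64(n-1) / 2
	// pairs * 2^(-bits), capped at 1 since it is a probability bound
	return math.Min(1, math.Ldexp(pairs, -bits))
}

func main() {
	for _, n := range []int{2, 10, 100} {
		fmt.Printf("n = %d events: bound = %.3g\n", n, collisionBound(n, 48))
	}
}
```

Even for a contract with 100 distinct events, the bound stays below 1e-10, which is why the truncation is reasonable when the first topic really is a hash.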
However, this argument applies only to nonanonymous events emitted by Solidity contracts. For anonymous events, or log entries created without Solidity events, the first topic need not be the output of a hash function, nor otherwise have a suitably low likelihood of collisions. For example, a hypothetical smart contract might use the first topic to store a contract version and event identifier, with the contract version making up the first six or more bytes. In that case, all such events will collide with each other with regard to the first six bytes of their first topic.
We are not aware of any deployed smart contracts that actually use logs in a way that would cause such collisions, however.
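The hypothetical versioned-topic scheme described above can be illustrated with a short Go sketch. The topic layout (a six-byte version followed by an event identifier in the last two bytes) and the helper names makeTopic and truncate are invented for this example and do not correspond to any known contract or to Brevis code.

```go
package main

import "fmt"

// prefixLen is the number of leading topic bytes kept as the event ID.
const prefixLen = 6

// makeTopic builds a hypothetical 32-byte first topic: a 6-byte contract
// version, zero padding, and a 2-byte event identifier at the end.
func makeTopic(version [prefixLen]byte, eventID uint16) [32]byte {
	var t [32]byte
	copy(t[:prefixLen], version[:])
	t[30] = byte(eventID >> 8)
	t[31] = byte(eventID)
	return t
}

// truncate keeps only the first six bytes of a topic.
func truncate(t [32]byte) [prefixLen]byte {
	var p [prefixLen]byte
	copy(p[:], t[:prefixLen])
	return p
}

func main() {
	version := [prefixLen]byte{0, 0, 0, 0, 0, 1}
	a := makeTopic(version, 1) // e.g. a "deposit" event
	b := makeTopic(version, 2) // e.g. a "withdraw" event

	fmt.Println("full topics differ:     ", a != b)
	fmt.Println("truncated topics match: ", truncate(a) == truncate(b))
}
```

Both lines print true: the two events are distinct as full 32-byte topics but indistinguishable once only the first six bytes are kept.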
Impact
Storing only the first six bytes of the first topic of a log entry in order to identify event types without collision may not be safe for log entries generated by methods other than Solidity's nonanonymous events.
The precise impact of this depends on handling of log entries in Brevis's backend circuits, which were not part of the audit scope at the current phase.
Recommendations
We recommend carefully checking how the project handles log entries that are not nonanonymous events emitted by Solidity. As such logs are much rarer than nonanonymous Solidity events, it may be reasonable to reject them. If they are not rejected, but collision resistance for the first topic field is relied upon, then the entire 32 bytes should be stored, rather than just the first six bytes.
Remediation
Brevis informed us that they were keeping the current implementation, in acknowledgement of the tradeoff between saving field elements in the packed representation of a Receipt and supporting contracts that emit rarer types of log entries, as standard Solidity events cover most of the cases that the Brevis SDK supports.