Assessment reports>Scroll zkEVM>Discussion>More robust constraints ensuring that all rw table entries are legitimate

More robust constraints ensuring that all rw table entries are legitimate

Here we discuss a possible way to make the circuits more robust against the kind of issue discussed in Finding ref.

We recall that the rw table is used to keep track of the consistency of read and write operations to, for example, storage, memory, and stack. This table is constrained by the state circuit, which ensures that, for example, reads that follow a write to some location will read the value written, that reads do not change the value, and so on. The state circuit does not however check that the entries are actually legitimate, so from the perspective of the state circuit only, any writes may appear in the table. Non-Start entries of the rw table are intended to have a unique ID, given by the counter, which is to increment on each read or write. Entries of the Start type have separate counters, counting up on each row. The number of rows that the state circuit has is fixed, and they are lexicographically ordered, first by the type, and those of type Start come first. The table is padded by adding Start rows, and rw_counter is constrained to increase by one between any two Start rows.

The end block gadget in the EVM circuit is performing lookups of Start rows in the rw table as follows (EndBlockGadget::configure function in zkevm-circuits/src/evm_circuit/execution/end_block.rs):

let max_rws = cb.query_copy_cell();

// ...

// If the block is empty, we do 0 rw_table lookups
// If the block is not empty, we will do 1 call_context lookup
// and add 1 withdraw_root lookup
let total_rws = not::expr(is_empty_block.expr())
    * (cb.curr.state.rw_counter.clone().expr() - 1.expr() + 1.expr())
    + 1.expr();

// ...

// 3. Verify rw_counter counts to the same number of meaningful rows in
// rw_table to ensure there is no malicious insertion.
// Verify that there are at most total_rws meaningful entries in the rw_table
cb.rw_table_start_lookup(1.expr());
cb.rw_table_start_lookup(max_rws.expr() - total_rws.expr());
// Since every lookup done in the EVM circuit must succeed and uses
// a unique rw_counter, we know that at least there are
// total_rws meaningful entries in the rw_table.
// We conclude that the number of meaningful entries in the rw_table
// is total_rws.

Here, max_rws is intended to be the fixed total number of rows in the rw table when including padding and total_rws the number of rw rows that the EVM circuit tracked as having looked up (perhaps through the copy circuit). It is looked up that a Start row with rw_counter = 1 and a Start row with rw_counter = max_rws - total_rws exist. As max_rws - total_rws should be positive and small enough that no wraparound can happen, there thus must be at least max_rws - total_rws many Start rows. This leaves at most total_rws many rows for everything else. Thus, if there really were total_rws many different legitimate rw lookups, then this ensures that there are no additional illegitimate ones in the rw table.

However, this hinges on cb.curr.state.rw_counter actually accurately reflecting the number of legitimate rw lookups done. But there are many different places in which rw lookups are done, including not in the EVM circuit itself, as in the case of copies where the rw lookups are done by the copy circuit. This means there are many places in which a mistake could mean that the rw_counter is not correctly tracked in a state transition, thereby allowing malicious writes to be inserted into the rw table. Finding ref is an example of this, where cb.curr.state.rw_counter does not track the number of legitimate rw lookups correctly. When there is a mistake such that the tracked counter is increased too much in one of the EVM circuit gadgets, then this will always be a critical issue due to allowing to add arbitrary illegitimate rw entries. For defense in depth, it could be considered to have a more robust catch-all check rather than relying on the rw_counter being tracked correctly in the many different gadgets.

What would prevent illegitimate rw rows from being inserted without relying on rw_counter tracking being fully correct for every state transition would be to directly ensure that there is no non-Start row in the rw table that has not been looked up legitimately. There can be multiple ways to implement this. Here we discuss one style of idea of how this could be done.

Let be a hash function acting as a random oracle. What is described below will need to be done in a phase after the rw table and all lookups in it have been assigned, and it needs to use a challenge that depends on those.

In the EVMConstraintBuilder::rw_lookup function (the function all lookups to the rw table in the EVM circuit go through), calculate a hash of the lookup being done together with the challenge, so where is all the data associated to the lookup, and add this to an accumulator. At the end of execution, the EVM circuit will then have calculated the sum of hashes for all rw lookups done by the EVM circuit (i.e., where is the set of all rw lookups done by the EVM circuit). Similarly in the copy circuit, also calculate the sum of hashes for rw lookups being done; where is the set of all rw lookups done by the copy circuit. Both circuits expose this value to the super circuit, which adds them up and thereby obtains the sum over the hash of each actually looked-up rw row, . The state circuit should then similarly calculate such a sum over all non-Start rows and expose it, so , where is the set of rows in the rw table. Finally, the super circuit checks that these two values agree, .

The first attacker model we consider here is where an attacker cannot modify which legitimate rw lookups are being done by the EVM and copy circuit, but they can cause the counter tracking to be off, so they can add additional rw entries. As the legitimate lookups must succeed, passing the above new check would then amount to finding illegitimate rw entries such that . But c is a hash of, among other inputs, the rw table, and that includes these illegitimate entries. All summands of the left-hand side are thus unpredictable random values until all have been fixed. This should imply that the attacker can do no better than using brute force to search for a positive number illegitimate entries where the left-hand side sums to zero, which should then be computationally intractable. Thus, the only feasible solution the prover could have used should be the one where there are zero summands (i.e., no illegitimate entries).

The second attacker model would be if the attacker can make two legitimate lookups actually look up the same entry in the table, as in the second variant discussed in Finding ref. This should also be stopped by the above check; this time, the attacker would need to satisfy , where is the multiset of duplicate lookups (if the same row is looked up times, then the entry should occur times in ). This should again be infeasible for the same reasons as in the first scenario. Note that this kind of attack is one that will not be caught by counter tracking as it is implemented even if the rw counter deltas are all correct, because the attacker "saves" one lookup they can then use for a smuggled entry.

The above suggestion is a starting point for the possibility of a more robust way of ensuring that no illegitimate entries occur in the rw table. This particular suggestion might not be implementable in practice due to, for example, performance requirements, as this would require a large number of in-circuit hashes.

Zellic © 2024Back to top ↑