## Endianess mismatch between Keccak circuit specification and implementation

The function `compose_into_field_elements`

in `circuits/src/keccak/utils.rs`

is implemented as follows:

```
/// Composes `bytes` into two 16-byte field elements. Each field element in `bytes`
/// is assumed to have been previously range checked.
///
/// # Specification
///
/// This function performs **Step 5: compose into field elements**
/// in the variable length keccak spec.
pub fn compose_into_field_elements<F: EccPrimeField>(
ctx: &mut Context<F>,
chip: &RangeChip<F>,
bytes: &[AssignedValue<F>; 32],
) -> [AssignedValue<F>; 2] {
let byte_decomposition_powers = byte_decomposition_powers()
.into_iter()
.map(|power| QuantumCell::from(ctx.load_constant(power)))
.collect_vec();
let result = bytes
.iter()
.rev()
.chunks(16)
.into_iter()
.map(|chunk| {
chip.gate.inner_product(
ctx,
chunk.into_iter().cloned().map(QuantumCell::from),
byte_decomposition_powers.clone(),
)
})
.collect::<Vec<_>>();
result
.try_into()
.expect("Conversion from vec into array is not allowed to fail")
}
```

The function `byte_decomposition_powers`

in turn is implemented as follows:

```
/// Returns an iterator of field elements over the powers of 2^8.
pub(crate) fn byte_decomposition_powers<F: EccPrimeField>() -> Vec<F> {
let num_bytes = num_bytes::<F>();
let mut powers = Vec::<F>::with_capacity(num_bytes);
let two_to_eight = F::from(1 << BYTE_SIZE_IN_BITS);
powers.push(F::one());
for i in 1..num_bytes {
powers.push(powers[i - 1] * two_to_eight);
}
powers
}
```

As can be seen here, the return value of `byte_decomposition_powers`

is the list of powers of $2_{8}$ in increasing order — so starting with $1,2_{8},(2_{8})_{2},…$.

Let us denote the components of `bytes`

by $b_{0},…,b_{31}$. Then, the `bytes`

array back in `compose_into_field_elements`

is first reversed by `rev()`

, resulting in an iterator over $[b_{31},…b_{0}]$. Chunking results in chunks $[b_{31},…b_{16}]$ and $[b_{15},…b_{0}]$. The mapped call of `inner_product`

then takes the inner product of each chunk with $[1,2_{8},(2_{8})_{2},…,(2_{8})_{1}5]$. The result will thus consist of $F_{1}=∑_{k=0}b_{16+15−k}⋅2_{8⋅k}$ and $F_{2}=∑_{k=0}b_{15−k}⋅2_{8⋅k}$. This does not match the specification for step 5 of the variable-length Keccak circuit, which (with slightly different notation) suggests that $F_{1}=∑_{k=1}b_{k}⋅2_{8⋅k}$ instead.

Apart from the sum in the specification starting with $1$ instead of $0$, the discrepancy can be summarized as the implementation treating the bytes as being given in big endian, whereas the specification suggests little endian.

The discrepancy discussed above was corrected in the specification in commit