Assessment reports>Universal Proof Aggregator>Critical findings>Verification-batching implementation unsound
Category: Coding Mistakes

Verification-batching implementation unsound

Critical Severity
High Impact
Medium Likelihood

Summary

The universal batch verifier verifies a batch of Groth16 proofs. Instead of verifying each proof individually, they are combined so that only an equality needs to be checked, which is more efficient. This finding concerns the assumptions under which this combined check implies validity of all the individual proofs; the current codebase violates these assumptions in a subtle manner, making the usual security proof for the batched check inapplicable. It is plausible that this could also in practice allow an attacker to construct a batch including an invalid proof for a legitimate (not chosen by the attacker) verification key but with UniversalBatchVerifierChip still verifying the batch.

Review of how Groth16 proofs are verified

To describe this finding, let us begin by reviewing how individual Groth16 proofs are verified and how the batching process works abstractly, following the specification in spec/circuits/universal_batch_verifier.md.

We are going to mostly follow the notational conventions from the specification and denote by the pairing over the BN254 elliptic curve. In contrast to the specification, we will write and but also additively. The scalar field of BN254 will be denoted by .

A Groth16 circuit comes with a verification key, which consists of elements of the groups and , where is the number of public inputs. Concretely, a verification key is of the form .

An instance must then provide the public inputs, which are elements .

A proof that proves knowledge of witnesses to the circuit with given public inputs then consists of only three group elements: .

A proof is valid if it satisfies the following pairing check,

where . By defining and rearranging the above equation, we can equivalently check the following.

Batching equality checks

Let us denote the left-hand side of the above equality check by . So, to verify a single proof, we must check that in .

Now suppose we are given a batch of proofs that we are to verify. We will keep notation as before but subscripts to . To verify all of those proofs, we must check that holds for every .

Given an element , we instead consider whether holds. Clearly, if holds for every , then will hold as well. What about if does not hold for every , how likely is it then that still holds? We claim that this is very unlikely if is a uniformly random element.

To show this, note that there is an isomorphism as -modules. Thus, is equivalent to . This can be rephrased as being a root of the polynomial . This polynomial is not zero by assumption (as at least one of the coefficients is nonzero) and has maximum degree . There can thus be at most many distinct roots, and hence, as is a uniformly random element of , the chance that is a root is at most .

The above shows that as long as is significantly smaller than and is chosen uniformly random, with the values not depending on , then the equality is strong evidence for holding for every .

It is important to stress how essential it is that the values do not depend on . If were chosen first, but an attacker can still change , then they could, for example, fix to whatever value they want and then set , which would make pass the check.

We can summarize the discussion so far as follows. We consider the following protocol to verify a batch of proofs:

  1. We are sent the verification keys, public inputs, and proofs.

  2. We choose a uniformly random .

  3. We accept if and reject otherwise.

What we argued for is that acceptance in step 3 is very unlikely to happen unless all individual proofs were valid.

In practice, such as in the universal batch verifier circuit, it may not be possible to make a uniformly random choice, but must be deterministically derived somehow. In such cases, we might use a hash function , which we model as a random oracle. The function produces output in that is uniformly random for each input, independently from the value at other inputs. However, evaluating again on the same input will produce the same output again.

We can then use this modified protocol to verify a batch of proofs:

  1. We are sent the verification keys, public inputs, and proofs.

  2. We calculate from the values received, and then .

  3. We accept if and reject otherwise.

For any fixed values sent in step 1, the situation is as before: is uniformly random, and furthermore the dependence on means that the polynomial cannot be changed without changing the challenge point to a different uniformly random value. Thus, it is very unlikely that we will accept unless all proofs were valid.

In contrast to the previous variant, however, is now predictable by the attacker. They can thus try locally whether their incorrect proofs will be rejected and try again and again before submitting. In this variant, there is thus the option to brute force locally (i.e., the attacker can keep trying different invalid proofs and check whether verification would succeed, until they find an invalid batch that nevertheless succeeds). If each verification is computationally intensive enough in combination with being large enough, finding an invalid batch like that would still be infeasible for the attacker, though.

Finally, it is not necessary for to be obtained by applying to . We can instead obtain by applying to values from which are deterministically computed. Then it will still not be possible to change without the challenge point changing to a new random value as well.

The batched check in the universal batch verifier chip

The universal batch verifier chip followed the above recipe, using the following for .

The verification is then done split up into steps (the numbering of the steps follows the specification). In step 4, the values are calculated. This is implemented by the function compute_pi_pairs.

In step 5, implemented in compute_pairs (after a call to compute_pi_pairs to handle the previous step), the inputs to the pairing are scaled, calculating , , , and .

In step 6, the multipairing is calculated to obtain the following value.

The function verify_with_challenge is the one that calls the multi_pairing of the batch verifier chip in order to calculate this from the list of group elements obtained before. It then calls check_pairing_result to check that the resulting value is (in our additive notation, in the code it is due to use of multiplicative notation).

Let us assume the following.

  • Assumption 1: are elements of , and are elements of , and are elements of for all and .

  • Assumption 2: The functions mentioned above implement the mathematical description mentioned above.

Then this is a secure way to perform a batch check, in the way discussed in the previous subsection. Indeed, we can expand the value that is checked against as follows:

Furthermore, all values that depend on are hashed to obtain .

Relevant assumptions made by halo2-ecc's EccChip

Let us go through the functions of the external (and out of scope) halo2-ecc crate's EccChip and list the assumptions they make for correct functioning.

The function calls to halo2-ecc functions or chips by the UniversalBatchVerifierChip that we are going to analyze are the following:

  • Call of variable_base_msm by compute_pi_pairs in order to calculate a partial sum for (part of step 4).

  • Call of sum by compute_pi_pairs in order to calculate from the previous partial sum and an additional summand (part of step 4).

  • Indirect call of scalar_multiply by compute_pairs in order to scale the inputs to the pairing with powers of the challenge (step 5).

  • Indirect usage of the halo2-ecc PairingChip by verify_with_challenge.

Calculation of the right summand for

For calculation of , the relevant code in the compute_pi_pairs is the following:

let rhs = ec_chip.variable_base_msm::<G1Affine>(
    builder,
    &ss[1..], // We skip the first element
    inputs,
    F::NUM_BITS as usize,
);
let pi_term = ec_chip.sum::<G1Affine>(
    builder.main(0),
    once(rhs).chain(once(ss[0].clone())),
);

Let us first focus on the use of variable_base_msm to calculate . Here, ss[1..] is and . The function eventually calls a function documented as having these assumptions:

/// # Assumptions
/// * `points.len() == scalars.len()`
/// * `scalars[i].len() == scalars[j].len()` for all `i, j`
/// * `scalars[i]` is less than the order of `P`
/// * `scalars[i][j] < 2^{max_bits} for all j`
/// * `max_bits <= modulus::<F>.bits()`, and equality only allowed when the order of `P` equals the modulus of `F`
/// * `points` are all on the curve or the point at infinity
/// * `points[i]` is allowed to be (0, 0) to represent the point at infinity (identity point)
/// * Currently implementation assumes that the only point on curve with y-coordinate equal to `0` is identity point

In the relevant case, max_bits == modulus::<F>.bits(). The fifth point would then seem to disallow the points from not having the full order (so ruling out the point at infinity). It appears plausible that this is an inaccuracy in the documentation here, and what is required is only that for all input points , which would also be satisfied by zero (the point at infinity).

What is more relevant is the requirement that all points lie on the curve or are the point at infinity. This assumption is not enforced by the UniversalBatchVerifierChip. An attacker can thus use arbitrary pairs for that may not satisfy the curve equation.

Calculation of

Then, the EccChip's sum function is used to calculate the sum of with the previously calculated value. It makes the assumption that both summands are valid points on the curve and not the point at infinity. As mentioned before, are not constrained and choosable by the attacker, so this assumption is not necessarily satisfied for the first summand. As the second summand is the return value of variable_base_msm, and as just discussed, the assumptions for its argument may also be broken, the second summand may also not satisfy the assumptions made by sum. Furthermore, the second summand might legitimately be the point at infinity, which is also not allowed as input for sum. If the attacker uses invalid values for , as discussed, or causes the second summand to be the point at infinity, we can then not rely on to be a well-formed point on the curve either.

Scaling

The calculation of , , , and is done using a call to the EccChip's scalar_multiply function:

scalar_multiply::<_, FpChip<F, Fq>, G1Affine>(
    self.fp_chip,
    ctx,
    g1.clone(),
    vec![*scalar],
    F::NUM_BITS as usize,
    WINDOW_BITS,
)

That function makes the following assumptions:

/// # Assumptions
/// - `window_bits != 0`
/// - The order of `P` is at least `2^{window_bits}` (in particular, `P` is not the point at infinity)
/// - The curve has no points of order 2.
/// - `scalar_i < 2^{max_bits} for all i`
/// - `max_bits <= modulus::<F>.bits()`, and equality only allowed when the order of `P` equals the modulus of `F`

This is similar to the two previous functions. Here, again, P must be a valid point on the curve, and the point at infinity is not allowed. Now the proof points and are checked to be on the curve and not the point at infinity. However, may be chosen arbitrarily by the attacker, and is the result of operations whose inputs may not satisfy the required assumptions discussed above. Thus, we cannot rely on and to be well-formed points on the curve.

Multipairing

The halo2-ecc PairingChip is then used to calculate the multipairing . This chip is somewhat less documented, but the comments for miller_loop_BN suggest that it is an assumption that the points are on the curve and that the points on the side are not the zero point. Again, an attacker can freely choose and , breaking those assumptions. Furthermore, as discussed in the previous points, and might not be well-formed points on the curve should the attacker have chosen and to not be well-formed.

The upshot is that if the attacker manipulates verification keys in the batch, then the verification result deviates from its normal specification/interpretation. The previously discussed security proof for why the batched verification implies validity of the individual proofs thus does not apply anymore.

Impact assuming the check only depends on verification key, public inputs, and proofs

Let us for a moment assume that the above functions of the EccChip and PairingChip constrain their return values deterministically from their arguments even if the arguments do not satisfy the expected assumptions.

If the batch consisted of only a single proof, then the above suggests that it may be possible to construct a proof for the UniversalBatchVerifierChip circuit that accepts an incorrect proof for a malformed verification key. This is unlikely to be useful to the attacker, as useful attacks likely require being able to convince another party of the validity of a proof for a given circuit, for which the attacker can then not change the verification key.

The possibility to consider is thus whether an attacker would be able to construct a batch of, for example, two proofs that get accepted by the UniversalBatchVerifierChip circuit — with one proof being an incorrect proof for a legitimate verification key and the second proof using a malformed verification key that the attacker chose specifically so that the batch would be accepted. In such a case, the input pairs to the multipairing corresponding to the legitimate proof would be as intended. In pseudocode, the multipairing would then compute something like

out_of_spec_multipairing(legitimate_pairs, out_of_spec_scalar_multiplication_on_left_side(c, malformed_attacker_controlled_pairs))

or

out_of_spec_multipairing(malformed_attacker_controlled_pairs, scalar_multiplication_on_left_side(c, legitimate_pairs))

As we are assuming at the moment that these attacker-controlled pairs depend only on the verification key, public inputs, and proof, the attacker cannot change them without causing c to change as well. While we cannot reason about out_of_spec_multipairing and out_of_spec_scalar_multiplication_on_left_side without unpacking the implementation, it seems plausible that it might be infeasible in practice for an attacker to construct a malformed verification key making such a batch pass, even though the proof for the well-formed verification key is invalid. However, as this cannot be relied upon without deep analysis of the implementations in halo2-ecc, it should still be ensured that all assumptions made by the functions from halo2-ecc used are satisfied.

Nondeterminism

In the previous section, we briefly discussed the impact, assuming that the used functions of the EccChip and PairingChip constrain their return values deterministically from their arguments even if the arguments do not satisfy the expected assumptions. However, this assumption is actually not true, as we will now see.

The sum function

Let us begin by looking into the implementation of sum in the EccChip of halo2-ecc:

/// None of elements in `points` can be point at infinity.
pub fn sum<C>(
    &self,
    ctx: &mut Context<F>,
    points: impl IntoIterator<Item = EcPoint<F, FC::FieldPoint>>,
) -> EcPoint<F, FC::FieldPoint>
where
    C: CurveAffineExt<Base = FC::FieldType>,
{
    let rand_point = self.load_random_point::<C>(ctx);
    let rand_point = into_strict_point(self.field_chip, ctx, rand_point);
    let mut acc = rand_point.clone();
    for point in points {
        let _acc = self.add_unequal(ctx, acc, point, true);
        acc = into_strict_point(self.field_chip, ctx, _acc);
    }
    self.sub_unequal(ctx, acc, rand_point, true)
}

Note that instead of directly computing the sum, this function starts out with a (intended random) point that is subtracted again at the end. The reason for this is that the general formulas for elliptic curve addition with points in affine coordinates do not apply if the two points are equal or additive inverses of each other. To be able to use only a single formula for addition, a trick is thus used: if is a random point, then it is unlikely that would be equal to or that would be equal to , so by performing the calculation as , most special cases can be avoided. The one special case that is not avoidable is , as then one would have . See Finding ref.

The random point is loaded by the following function:

pub fn load_random_point<F, FC, C>(chip: &FC, ctx: &mut Context<F>) -> EcPoint<F, FC::FieldPoint>
where
    F: PrimeField,
    FC: FieldChip<F>,
    C: CurveAffineExt<Base = FC::FieldType>,
{
    let base_point: C = C::CurveExt::random(ChaCha20Rng::from_entropy()).to_affine();
    let (x, y) = base_point.into_coordinates();
    let base = {
        let x_overflow = chip.load_private(ctx, x);
        let y_overflow = chip.load_private(ctx, y);
        EcPoint::new(x_overflow, y_overflow)
    };
    // for above reason we still need to constrain that the witness is on the curve
    check_is_on_curve::<F, FC, C>(chip, ctx, &base);
    base
}

As can be seen here, on witness generation, the point is generated randomly. An attacker must not follow this, however, and can choose specific values. The only in-circuit constraint imposed here is that the point is actually on the curve. The check_is_on_curve function checks that the pair of coordinates satisfies the curve equation, so the point at infinity is ruled out here.

The implementations of ec_add_unequal and ec_sub_unequal (which are the functions ultimately called by add_unequal and sub_unequal) do not treat , which is used to represent the point at infinity, in a separate way, as can be seen here in the example of addition:

// Implements:
//  Given P = (x_1, y_1) and Q = (x_2, y_2), ecc points over the field F_p
//      assume x_1 != x_2
//  Find ec addition P + Q = (x_3, y_3)
// By solving:
//  lambda = (y_2-y_1)/(x_2-x_1) using constraint
//  lambda * (x_2 - x_1) = y_2 - y_1
//  x_3 = lambda^2 - x_1 - x_2 (mod p)
//  y_3 = lambda (x_1 - x_3) - y_1 mod p
//
/// If `is_strict = true`, then this function constrains that `P.x != Q.x`.
/// If you are calling this with `is_strict = false`, you must ensure that `P.x != Q.x` by some external logic (such
/// as a mathematical theorem).
///
/// # Assumptions
/// * Neither `P` nor `Q` is the point at infinity (undefined behavior otherwise)
pub fn ec_add_unequal<F: PrimeField, FC: FieldChip<F>>(
    chip: &FC,
    ctx: &mut Context<F>,
    P: impl Into<ComparableEcPoint<F, FC>>,
    Q: impl Into<ComparableEcPoint<F, FC>>,
    is_strict: bool,
) -> EcPoint<F, FC::FieldPoint> {
    let (P, Q) = check_points_are_unequal(chip, ctx, P, Q, is_strict);

    let dx = chip.sub_no_carry(ctx, &Q.x, &P.x);
    let dy = chip.sub_no_carry(ctx, Q.y, &P.y);
    let lambda = chip.divide_unsafe(ctx, dy, dx);

    //  x_3 = lambda^2 - x_1 - x_2 (mod p)
    let lambda_sq = chip.mul_no_carry(ctx, &lambda, &lambda);
    let lambda_sq_minus_px = chip.sub_no_carry(ctx, lambda_sq, &P.x);
    let x_3_no_carry = chip.sub_no_carry(ctx, lambda_sq_minus_px, Q.x);
    let x_3 = chip.carry_mod(ctx, x_3_no_carry);

    //  y_3 = lambda (x_1 - x_3) - y_1 mod p
    let dx_13 = chip.sub_no_carry(ctx, P.x, &x_3);
    let lambda_dx_13 = chip.mul_no_carry(ctx, lambda, dx_13);
    let y_3_no_carry = chip.sub_no_carry(ctx, lambda_dx_13, P.y);
    let y_3 = chip.carry_mod(ctx, y_3_no_carry);

    EcPoint::new(x_3, y_3)
}

If at least one of the summands is not a valid affine point on the curve, ec_add_unequal will return an affine point deterministically calculated from the input, but without interpretation in terms of addition on the elliptic curve (similarly for ec_sub_unequal).

As discussed before, in the calculation of , the above sum function is used and an attacker can make one or both of the summands the point at infinity, which is represented by . The important thing to note now is that the incorrect result returned by ec_add_unequal and ec_sub_unequal for invalid arguments do not behave in the usual way such that the random point cancels out. This means that calling sum with one of the summands being will return a result that is dependent on the random point being loaded by sum itself.

This dependence can be verified by the following SageMath script:

### bn254
p = 0x30644e72e131a029b85045b68181585d97816a916871ca8d3c208c16d87cfd47
K = GF(p)
a = K(0x0000000000000000000000000000000000000000000000000000000000000000)
b = K(0x0000000000000000000000000000000000000000000000000000000000000003)
E = EllipticCurve(K, (a, b))
r = 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
#G = E(0x2523648240000001BA344D80000000086121000000000013A700000000000012, 0x0000000000000000000000000000000000000000000000000000000000000001)
#E.set_order(0x2523648240000001BA344D8000000007FF9F800000000010A10000000000000D * 0x01)


# The following two functions reflect the implementation in the halo2-ecc
# EccChip.

# ec_add_unequal
def add(P, Q):
  (x_1, y_1) = P
  (x_2, y_2) = Q
  assert x_1 != x_2
  lam = (y_2-y_1)/(x_2-x_1)
  x_3 = lam^2 - x_1 - x_2
  y_3 = lam*(x_1 - x_3) - y_1
  return (x_3, y_3)

# ec_sub_unequal
def sub(P, Q):
  (x_1, y_1) = P
  (x_2, y_2) = Q
  assert x_1 != x_2
  lam = -(y_2+y_1)/(x_2-x_1)
  x_3 = lam^2 - x_1 - x_2
  y_3 = lam*(x_1 - x_3) - y_1
  return (x_3, y_3)


# Sanity check: for random legitimate points on the curve, the above functions
# should be addition / subtraction
P = E.random_point()
Q = E.random_point()
assert ((P+Q).xy() == add(P.xy(), Q.xy()))
assert ((P-Q).xy() == sub(P.xy(), Q.xy()))

def is_on_curve(P):
  return P[1]**2 == P[0]**3 + 3
assert is_on_curve(E.random_point().xy())

# Tests whether f (that takes a random value as input) is independent of it,
# and whether the output is on the curve.
def test_constant_and_on_curve(f):
  R1, R2 = (E.random_point().xy() for _ in range(2))
  f1 = f(R1)
  f2 = f(R2)
  return (f1 == f2, is_on_curve(f1))

def sum_func(P, Q):
  def helper(R):
    return sub(add(add(R, P), Q), R)
  return helper

def test_left_zero():
  P = (K(0),K(0))
  Q = E.random_point().xy()
  return test_constant_and_on_curve(sum_func(P, Q)) 

def test_right_zero():
  P = E.random_point().xy()
  Q = (K(0),K(0))
  return test_constant_and_on_curve(sum_func(P, Q)) 

def test_none_zero():
  P = E.random_point().xy()
  Q = E.random_point().xy()
  return test_constant_and_on_curve(sum_func(P, Q)) 

# If both points are valid, then sum does not depend on the random point and
# the output is on the curve
assert test_none_zero() == (True, True)
# but if one of the two summands is (0,0), then the output does depend on the
# random point, and the output may not be on the curve
assert test_left_zero() == (False, False) 
assert test_right_zero() == (False, False) 

The variable_base_msm and scalar_multiply functions

The variable_base_msm and scalar_multiply functions are ultimately implemented by either by multi_scalar_multiply or pippenger::multi_exp_par. Both use a random-point trick analogous to what was discussed for sum. In contrast to sum, both functions are intended to deal correctly with as representing the point at infinity, according to the documentation. While we did not test this, it appears highly likely that the output will be dependent on the random point, however, should one or more of the input points fail to satisfy the curve equation while being unequal to .

Impact of nondeterminism

The dependence of the result of the multipairing on these points in the cases just discussed means that an attacker still has parameters they can change after the challenge has been fixed. The case of a malformed verification key could be dealt with by rejecting a verified batch out-of-circuit whenever one of the included verification keys is malformed. However, as discussed before, the attacker can also introduce dependence on such a point by setting all public inputs to zero. For each proof in the batch for which the attacker does this, they essentially gain one degree of freedom to influence the result of the multipairing. Without unpacking in detail the implementation of the relevant functions, including the multi-Miller loop, it is difficult to say how feasible it will be for an attacker to make choices for these points with which they can make the universal batch verifier circuit accept a batch that includes an invalid proof for a legitimate (not attacker chosen) verification key. However, it appears plausible that this could be possible.

Recommendations

All points in the verification key should be checked to actually lie on the curve. As the universal batch verifier circuit exposes these points as public instances, this does not necessary need to be done in circuit, but in that case, it should be documented clearly that this must be checked outside.

Furthermore, the possibility that sum might be called in the compute_pi_pairs function with the second summand being the point at infinity should be removed.

One way to do that would be to use variable_base_msm with points ss and scalars [1] + input to calculate the entire sum directly, rather than calling it with only points ss[1..] and scalars input and then using sum to add ss[0] to this intermediate result.

Remediation

This issue has been acknowledged by Nebra, and fixes were implemented in the following commits:

Zellic © 2025Back to top ↑