Concrete examples for assumptions for bounds and how to document them
There are many implicit assumptions throughout the codebase regarding bounds that are undocumented but required for correct functioning. We have generally only mentioned those instances in this report, which plausibly could cause a problem with likely usage, without a more serious issue occurring as well. In this section, in expansion of what was discussed in ref↗, we discuss two concrete representative examples of the kind of implicit assumptions scattered throughout the codebase. For the second example, we discuss how one might specify and document the code with the assumptions that the caller should ensure.
Example for common implicit assumptions on bounds
The unsafe_evaluate_multiple_multiply_add
contains the following calculations, done with types that are 512-bits wide:
max_d0 += (neg_modulus_limbs_u256[3] * quotient.binary_basis_limbs[0].maximum_value);
uint512_t max_d1 = (left.binary_basis_limbs[2].maximum_value * to_mul.binary_basis_limbs[1].maximum_value);
max_d1 += (neg_modulus_limbs_u256[2] * quotient.binary_basis_limbs[1].maximum_value);
uint512_t max_d2 = (left.binary_basis_limbs[1].maximum_value * to_mul.binary_basis_limbs[2].maximum_value);
max_d2 += (neg_modulus_limbs_u256[1] * quotient.binary_basis_limbs[2].maximum_value);
uint512_t max_d3 = (left.binary_basis_limbs[0].maximum_value * to_mul.binary_basis_limbs[3].maximum_value);
max_d3 += (neg_modulus_limbs_u256[0] * quotient.binary_basis_limbs[3].maximum_value);
uint512_t max_r0 = left.binary_basis_limbs[0].maximum_value * to_mul.binary_basis_limbs[0].maximum_value;
max_r0 += (neg_modulus_limbs_u256[0] * quotient.binary_basis_limbs[0].maximum_value);
// @AUDIT: these are the maximums of the convolution product limbs of left*to_mul + (-p)*quotient
const uint512_t max_r1 = max_b0 + max_b1;
const uint512_t max_r2 = max_c0 + max_c1 + max_c2;
const uint512_t max_r3 = max_d0 + max_d1 + max_d2 + max_d3;
Each of max_d0
, max_d1
, max_d2
, and max_d3
contain a summand that is the product of maximum values of limbs of left
and to_mul
. Those maximum values are stored as 256-bit unsigned integers. Their products could thus reach up to 512 bits in width, so four summands of that size could overflow when performing the addition in 512 bits. The function has no documentation detailing which bounds the caller needs to ensure.
The just discussed overflow could only happen in much more extreme and unlikely situations than those in Finding ref↗. There are many similar pieces of code through the bigfield
codebase of the same type, however.
Example for documenting bounds
We describe how one could specify assumptions for functions.
Consider the following example:
template <typename Builder, typename T>
uint512_t bigfield<Builder, T>::compute_maximum_quotient_value(const std::vector<uint512_t>& as, const std::vector<uint512_t>& bs, const std::vector<uint512_t>& to_add)
{
ASSERT(as.size() == bs.size());
uint512_t add_values(0);
for (const auto& add_element : to_add) {
add_values += add_element;
}
uint1024_t product_sum(0);
for (size_t i = 0; i < as.size(); i++) {
product_sum += uint1024_t(as[i]) * uint1024_t(bs[i]);
}
const uint1024_t add_right(add_values);
const uint1024_t modulus(target_basis.modulus);
const auto [quotient_1024, remainder_1024] = (product_sum + add_right).divmod(modulus);
return quotient_1024.lo;
}
This function takes as arguments various unsigned 512-bit integers. It then computes component-wise products of as
and bs
and adds all components up, together with the components of to_add
. Computation of the sum of components of to_add
is done in 512 bits, and the rest of the computation is done with a maximum bit width of 1,024 bits.
Already the addition of two components of to_add
could overflow in 512 bits, if at least one of them is 512 bits wide. So, to properly work, this function requires bounds on the components of to_add
. The most permissive bounds ruling out overflows would be the following:
sum(to_add) < 2^512
dot_product(as, bs) + sum(to_add) < 2^1024
However, requiring these kind of conditions on the caller might not be very user-friendly. It would be easier for the caller to think about bounds for individual arguments, and it seems likely that those can indeed be assumed. For example, all values passed as arguments will likely be at most the maximum value that could possibly be represented by a bigfield
element, obtained via bigfield::get_maximum_value()
. This would be the value we obtain if maximum_value
for all four binary limbs is the maximum value it can be, which is 2^256 - 1
.
Thus, we can estimate the maximum value that can be returned by bigfield::get_maximum_value()
to be less than 2^256 * (1 + 2^68 + 2^(2*68) + 2^(3*68)) < 2^(256+3*68+1) = 2^461
. A sum of values that are at most 461 bits wide can only overflow with 512 bits available if there are at least 2^51
summands. As passing a vector with that many summands would require petabytes of memory, we can rule this out as happening in practice. So instead of asking the caller to ensure sum(to_add) < 2^512
(which means having to make the kind of reasoning we just carried out each time one adds a new call to compute_maximum_quotient_value
elsewhere in the code), it is likely less cumbersome to request that all components of to_add
are at most 461 bits wide.
Similar reasoning can be carried out for the other relevant bounds for this function. Ultimately, we might end with the function documented (with respect to bounds, the function could also benefit from a documenting comment regarding what it is intended to do) as follows:
// Assumptions:
// 1. All components of `as`, `bs`, and `to_add` must be smaller than `1<<461`.
// Note that the above assumption will be satisfied for return values of `bigfield::get_maximum_value()`.
// Reason for that: The maximum value that can be returned is `(2^256 - 1) * (1 + 2^68 + 2^(2*68) + 2^(3*68))`, which is less than `1<<461`.
// Promises:
// - The return value will be less than `1<<820`.
template <typename Builder, typename T>
uint512_t bigfield<Builder, T>::compute_maximum_quotient_value(const std::vector<uint512_t>& as, const std::vector<uint512_t>& bs, const std::vector<uint512_t>& to_add)
{
ASSERT(as.size() == bs.size());
// The below loop calculates the sum of the components of `to_add` in the `uint512_t` type.
// An overflow would require at least `2^(512-461) = 2^51` summands, which we can reject as unrealistic.
uint512_t add_values(0);
for (const auto& add_element : to_add) {
add_values += add_element;
}
// Each summand of `product_sum` will be less than `2^(2*461) = 2^922`.
// Again, `2^100` or more summands are unrealistic, so this sum will not overflow, and will also stay less than `2^1023`.
uint1024_t product_sum(0);
for (size_t i = 0; i < as.size(); i++) {
product_sum += uint1024_t(as[i]) * uint1024_t(bs[i]);
}
const uint1024_t add_right(add_values);
const uint1024_t modulus(target_basis.modulus);
// The sum `product_sum + add_right` does not overflow, as `add_right` can be at most 512 bits wide, and we argued that `product_sum < 2^1023`.
const auto [quotient_1024, remainder_1024] = (product_sum + add_right).divmod(modulus);
// `modulus` must be bigger than or equal to `2^(3*68) = 2^204` by general assumption for this class.
// Thus, with the sum being divided being less than `2^1024`, we conclude that the quotient must be smaller than `2^(1024-204)=2^820`.
return quotient_1024.lo;
}
The above is merely an example for how one might thoroughly document reasoning about bounds. Different bounds may be desired for this function.