Category: Coding Mistakes

Finality provider can crash when submitting signature on finalized block

Critical Severity
Medium Impact
Low Likelihood

Description

The finality-provider is a tool run by all finality providers. It automatically fetches new Babylon blocks and commits a randomness number to these blocks before providing finality signatures.

The finality-signature submission occurs through the finalitySigSubmissionLoop(). This function, on a high level, does the following:

  1. Ensures the finality provider has voting power at the current block height.

  2. Waits until the randomnessCommitmentLoop() function has committed a randomness number to the current block, or until the block has already been finalized because other finality providers submitted enough signatures.

  3. Tries to submit a finality signature for the block, retrying at an interval.

In step 2, it uses the retryCheckRandomnessUntilBlockFinalized() function. The important things to note are its two exit conditions:

  • Exit-condition 1 — The block's randomness number was committed.

  • Exit-condition 2 — The block was already finalized.

In both of these cases, the function returns nil, which signals to finalitySigSubmissionLoop() that it should go on to submit a finality signature. This is incorrect when exit-condition 2 was the reason for exiting the retryCheckRandomnessUntilBlockFinalized() function.
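The ambiguity can be illustrated with a minimal sketch. This is a simplified model, not the finality-provider source: the blockState type, the poll callback, and the names waitForRandomnessOrFinality and shouldSubmit are all hypothetical, and the real function retries on a timer rather than a fixed poll count.

```go
package main

import "errors"

// blockState is a hypothetical, simplified view of one block as seen by
// a finality provider. It is not a type from the real code base.
type blockState struct {
	randomnessCommitted bool
	finalized           bool
}

// waitForRandomnessOrFinality models the two exit conditions described
// above. poll is a hypothetical callback returning the latest state.
func waitForRandomnessOrFinality(poll func() blockState, maxPolls int) error {
	for i := 0; i < maxPolls; i++ {
		s := poll()
		if s.randomnessCommitted {
			return nil // exit-condition 1
		}
		if s.finalized {
			return nil // exit-condition 2: same return value as condition 1
		}
	}
	return errors.New("randomness not committed and block not finalized")
}

// shouldSubmit mirrors the caller's decision in this model: a nil error
// is read as "go ahead and submit a finality signature".
func shouldSubmit(poll func() blockState) bool {
	return waitForRandomnessOrFinality(poll, 3) == nil
}
```

In this model, a block that is finalized but has no committed randomness still makes shouldSubmit return true, because the caller cannot tell the two exit conditions apart.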

With exit-condition 2, the problem is that when the finality provider attempts to submit a finality signature, the block's randomness number has likely not been committed at all. In this scenario, the AddFinalitySig() message handler in Babylon will return the ErrPubRandNotFound error:

// ensure the finality provider has committed public randomness
pubRand, err := ms.GetPubRand(ctx, fpPK, req.BlockHeight)
if err != nil {
	return nil, types.ErrPubRandNotFound
}

This error, in turn, is treated as one of many unrecoverable errors on the finality provider:

var unrecoverableErrors = []*sdkErr.Error{
	finalitytypes.ErrBlockNotFound,
	finalitytypes.ErrInvalidFinalitySig,
	finalitytypes.ErrNoPubRandYet,
	finalitytypes.ErrPubRandNotFound,
	finalitytypes.ErrTooFewPubRand,
	btcstakingtypes.ErrFpAlreadySlashed,
}

Therefore, when the finality provider attempts to submit a finality signature for this block, it will get back an ErrPubRandNotFound and subsequently exit.
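The way a single rejected submission ends the retry loop can be sketched as follows. This is an illustration under stated assumptions: it uses standard-library sentinel errors and errors.Is instead of the Cosmos SDK error types used in the real code, and the names errPubRandNotFound, rejectMissingRandomness, isUnrecoverable, and submitWithRetry are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for two entries of the unrecoverable-error list.
var (
	errBlockNotFound   = errors.New("block not found")
	errPubRandNotFound = errors.New("public randomness not found")
)

var unrecoverable = []error{errBlockNotFound, errPubRandNotFound}

// rejectMissingRandomness models the handler rejecting a signature for a
// block with no committed randomness, wrapped the way a client might see it.
func rejectMissingRandomness() error {
	return fmt.Errorf("add finality sig: %w", errPubRandNotFound)
}

// isUnrecoverable reports whether err matches any listed sentinel.
func isUnrecoverable(err error) bool {
	for _, u := range unrecoverable {
		if errors.Is(err, u) {
			return true
		}
	}
	return false
}

// submitWithRetry calls submit up to attempts times. It stops at once and
// reports fatal=true on the first unrecoverable error.
func submitWithRetry(submit func() error, attempts int) (fatal bool, err error) {
	for i := 0; i < attempts; i++ {
		err = submit()
		if err == nil {
			return false, nil
		}
		if isUnrecoverable(err) {
			return true, err
		}
	}
	return false, err
}
```

In this sketch, a submission that fails with the missing-randomness error is never retried: the first attempt is classified as fatal, which corresponds to the finality provider exiting.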

Impact

If enough finality providers run into this issue, the chain will be left in a state where block finality can never reach quorum, thus leading to a finality halt.

A finality halt is critical in nature. However, for the following reasons, we rate the impact of this bug as medium:

  1. There is no attacker. The bug will trigger by itself under certain conditions, thus having a low likelihood.

  2. The finality providers can just restart to fix the issue.

Recommendations

Add logic to retryCheckRandomnessUntilBlockFinalized() such that it returns something different if exit-condition 2 was the reason for exiting the function (i.e., the block was finalized). Then, the finalitySigSubmissionLoop() function can skip the block.
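One possible shape for this recommendation is sketched below. It is not the fix from the referenced commit, whose contents are not shown here. The waitOutcome type, its constants, and the processBlock helper are hypothetical names, and the poll-count loop stands in for the real timer-based retry.

```go
package main

import "errors"

// blockState is a hypothetical, simplified view of one block.
type blockState struct {
	randomnessCommitted bool
	finalized           bool
}

// waitOutcome tells the caller which exit condition ended the wait.
type waitOutcome int

const (
	waitUnknown      waitOutcome = iota // returned together with an error
	randomnessReady                     // exit-condition 1
	alreadyFinalized                    // exit-condition 2
)

// waitForRandomnessOrFinality returns a distinct outcome per exit condition.
func waitForRandomnessOrFinality(poll func() blockState, maxPolls int) (waitOutcome, error) {
	for i := 0; i < maxPolls; i++ {
		s := poll()
		if s.randomnessCommitted {
			return randomnessReady, nil
		}
		if s.finalized {
			return alreadyFinalized, nil
		}
	}
	return waitUnknown, errors.New("randomness not committed and block not finalized")
}

// processBlock submits a signature only when randomness is ready and
// skips blocks that were finalized without it.
func processBlock(poll func() blockState, submit func() error) (submitted bool, err error) {
	outcome, err := waitForRandomnessOrFinality(poll, 3)
	if err != nil {
		return false, err
	}
	if outcome == alreadyFinalized {
		return false, nil // skip: no signature is attempted for this block
	}
	if err := submit(); err != nil {
		return false, err
	}
	return true, nil
}
```

A separate outcome value keeps the error return free for genuine failures. A dedicated sentinel error checked by the caller would be an alternative design with the same effect.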

Remediation

This issue has been acknowledged by Babylon, and a fix was implemented in commit 9fe04d26.

Zellic © 2025