Zellic - Audit Reports

Category: Coding Mistakes

The `_toLower` function incorrectly handles Unicode characters

Medium Severity

Medium Impact

Medium Likelihood

Description

The internal `_toLower` function converts the hashtag string to lowercase. It correctly processes only the ASCII characters A to Z (byte values 65 to 90). However, the input string can contain arbitrary Unicode characters, which cover far more than the basic Latin alphabet. Because the function works byte by byte and every byte of a multibyte UTF-8 sequence lies outside the 65 to 90 range, uppercase letters outside ASCII (for example, accented Latin, Cyrillic, or Greek letters) are not converted and pass through unchanged.

function _toLower(string memory str) internal pure returns (string memory) {
    bytes memory bStr = bytes(str);
    bytes memory bLower = new bytes(bStr.length);
    for (uint i = 0; i < bStr.length; i++) {
        // Uppercase character...
        if ((uint8(bStr[i]) >= 65) && (uint8(bStr[i]) <= 90)) {
            // So we add 32 to make it lowercase
            bLower[i] = bytes1(uint8(bStr[i]) + 32);
        } else {
            bLower[i] = bStr[i];
        }
    }
    return string(bLower);
}

Impact

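If the input hashtag string contains non-ASCII characters, they are not lowercased and remain unchanged after processing by the `_toLower` function. As a result, two hashtags that differ only in the case of a non-ASCII letter are still different strings after normalization, so any contract functionality that relies on `_toLower` producing a case-insensitive form of the hashtag may behave incorrectly.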
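The behavior can be reproduced with a small harness. The following is a minimal sketch, not part of the audited code base: the contract name `ToLowerDemo` and its two check functions are hypothetical, and the body of `_toLower` is copied from the description above. It assumes Solidity 0.8.x and UTF-8 encoded input.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical harness for illustration only; not part of the audited code.
contract ToLowerDemo {
    // Same logic as the audited function: only bytes 65..90 (A-Z) are shifted.
    function _toLower(string memory str) internal pure returns (string memory) {
        bytes memory bStr = bytes(str);
        bytes memory bLower = new bytes(bStr.length);
        for (uint256 i = 0; i < bStr.length; i++) {
            if ((uint8(bStr[i]) >= 65) && (uint8(bStr[i]) <= 90)) {
                bLower[i] = bytes1(uint8(bStr[i]) + 32);
            } else {
                bLower[i] = bStr[i];
            }
        }
        return string(bLower);
    }

    // "É" is encoded in UTF-8 as 0xC3 0x89. Neither byte is in 65..90,
    // so it is copied through unchanged while the ASCII "A" becomes "a".
    function nonAsciiIsUnchanged() external pure returns (bool) {
        return keccak256(bytes(_toLower(unicode"ÉA"))) == keccak256(bytes(unicode"Éa"));
    }

    // Consequently "É" and "é" do not normalize to the same string.
    function caseVariantsDiffer() external pure returns (bool) {
        return keccak256(bytes(_toLower(unicode"É"))) != keccak256(bytes(_toLower(unicode"é")));
    }
}
```

Both check functions return `true`, which shows that case-insensitive comparison holds only for the ASCII part of the hashtag.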

Recommendations

If the project does not intend to support the extended Unicode character set, consider restricting the input to the required ASCII characters and rejecting any hashtag that contains other bytes.
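One way to apply this recommendation is to validate the input before it is lowercased. The sketch below is an assumption about what such a check could look like, not the fix that was actually implemented: the library name `HashtagCharset`, the function `isAllowed`, and the chosen character set (ASCII letters, digits, and underscore) are all hypothetical and should be adapted to the project's requirements.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper for illustration; the allowed character set is an assumption.
library HashtagCharset {
    // Returns true if every byte is an ASCII letter, digit, or underscore.
    // An empty string also returns true, so length checks must be done separately.
    function isAllowed(string memory str) internal pure returns (bool) {
        bytes memory b = bytes(str);
        for (uint256 i = 0; i < b.length; i++) {
            uint8 c = uint8(b[i]);
            bool isUpper = c >= 65 && c <= 90;   // A-Z
            bool isLower = c >= 97 && c <= 122;  // a-z
            bool isDigit = c >= 48 && c <= 57;   // 0-9
            bool isUnderscore = c == 95;         // _
            if (!(isUpper || isLower || isDigit || isUnderscore)) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could then guard the conversion with something like `require(HashtagCharset.isAllowed(hashtag), "invalid hashtag");` before invoking `_toLower`. Since every byte of a multibyte UTF-8 sequence is 128 or greater, this check rejects all non-ASCII input, and `_toLower` only ever sees characters it handles correctly.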

Remediation

This issue has been acknowledged by SAX, and a fix was implemented in commit dfb4e3f1.
