Category: Coding Mistakes

The _toLower function incorrectly handles Unicode characters

Medium Severity
Medium Impact
Medium Likelihood

Description

The internal _toLower function converts the hashtag string to lowercase, but it only handles the ASCII letters A to Z (byte values 65 to 90). The function iterates over the string byte by byte, and Solidity strings are conventionally UTF-8 encoded, where every byte of a non-ASCII character has a value of 128 or higher. As a result, uppercase letters outside the basic Latin alphabet (for example, É or Я) never fall within the checked range and are copied to the output unchanged.

function _toLower(string memory str) internal pure returns (string memory) {
    bytes memory bStr = bytes(str);
    bytes memory bLower = new bytes(bStr.length);
    for (uint i = 0; i < bStr.length; i++) {
        // Uppercase character...
        if ((uint8(bStr[i]) >= 65) && (uint8(bStr[i]) <= 90)) {
            // So we add 32 to make it lowercase
            bLower[i] = bytes1(uint8(bStr[i]) + 32);
        } else {
            bLower[i] = bStr[i];
        }
    }
    return string(bLower);
}
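
To make the behavior concrete, here is a minimal sketch that is not part of the audited code. The contract ToLowerDemo and the functions sameAfterLower and demo are hypothetical names introduced for illustration; only _toLower is taken from the listing above. It assumes Solidity 0.8.x, where non-ASCII literals need the unicode prefix.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract; only _toLower comes from the audited code.
contract ToLowerDemo {
    function _toLower(string memory str) internal pure returns (string memory) {
        bytes memory bStr = bytes(str);
        bytes memory bLower = new bytes(bStr.length);
        for (uint i = 0; i < bStr.length; i++) {
            // Only bytes 65..90 (ASCII A-Z) are shifted to lowercase.
            if ((uint8(bStr[i]) >= 65) && (uint8(bStr[i]) <= 90)) {
                bLower[i] = bytes1(uint8(bStr[i]) + 32);
            } else {
                bLower[i] = bStr[i];
            }
        }
        return string(bLower);
    }

    // True when both inputs are byte-for-byte equal after _toLower.
    function sameAfterLower(string memory a, string memory b) public pure returns (bool) {
        return keccak256(bytes(_toLower(a))) == keccak256(bytes(_toLower(b)));
    }

    function demo() external pure returns (bool asciiMatches, bool unicodeMatches) {
        // ASCII letters are folded, so these normalize to the same bytes.
        asciiMatches = sameAfterLower("SAX", "sax");
        // In UTF-8, "É" is 0xC3 0x89 and "é" is 0xC3 0xA9. Neither byte is in
        // 65..90, so "É" is copied unchanged and the two results differ.
        unicodeMatches = sameAfterLower(unicode"CAFÉ", unicode"café");
    }
}
```

Under these assumptions, demo returns true for asciiMatches and false for unicodeMatches: the ASCII pair is normalized to the same bytes, while the pair that differs only in the case of É is not.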

Impact

If the input hashtag string contains uppercase non-ASCII letters, _toLower leaves them unchanged, so the result is not fully lowercased. Two hashtags that differ only in the case of such a letter therefore remain distinct after processing, and any contract logic that relies on _toLower for case-insensitive handling of hashtags may behave incorrectly.

Recommendations

If the project does not need to support the extended Unicode character set in hashtags, consider restricting the input to the required ASCII characters and rejecting any other input.
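
One possible shape for such a restriction is sketched below. This is an illustration only and not necessarily what the fix commit implements: the library HashtagValidation, the function requireAscii, the error InvalidHashtagCharacter, and the allowed character set (ASCII letters, digits, and underscore) are all assumptions. It assumes Solidity 0.8.4 or later for custom errors.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper; the names and the allowed character set are assumptions.
library HashtagValidation {
    // Reports the byte offset of the first disallowed character.
    error InvalidHashtagCharacter(uint256 index);

    // Reverts unless every byte is an ASCII letter, digit, or underscore.
    // Any UTF-8 multi-byte character is rejected, because all of its bytes
    // have values of 128 or higher.
    function requireAscii(string memory str) internal pure {
        bytes memory b = bytes(str);
        for (uint256 i = 0; i < b.length; i++) {
            uint8 c = uint8(b[i]);
            bool allowed = (c >= 48 && c <= 57) // 0-9
                || (c >= 65 && c <= 90)         // A-Z
                || (c >= 97 && c <= 122)        // a-z
                || c == 95;                     // _
            if (!allowed) revert InvalidHashtagCharacter(i);
        }
    }
}
```

Calling a check like this before _toLower would guarantee that every letter reaching the function is one it can actually convert. The trade-off is that hashtags in non-Latin scripts are rejected outright, which is acceptable only if the project does not intend to support them.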

Remediation

This issue has been acknowledged by SAX, and a fix was implemented in commit dfb4e3f1.

Zellic © 2024