Assessment reports>Initia>Critical findings>Unchecked UTF-8 decoding enables memory corruption
Category: Coding Mistakes

Unchecked UTF-8 decoding enables memory corruption

Critical Severity
High Impact
Medium Likelihood

Description

The following function fails to ensure the vector of strings is a well-formed UTF-8 before returning it:

public fun to_vector_string(v: vector<u8>): vector<String> {
    from_bytes<vector<String>>(v)
}

So, it is possible to obtain a String containing invalid UTF-8 sequences.

Impact

A malformed UTF-8 String may be able to cause an out-of-bounds (OOB) memory read, leading to a segmentation fault or potentially providing an attacker with an arbitrary read primitive — which would allow an attacker to break chain consensus by reading OOB, nondeterministic memory. Arbitrary read primitives may also be used in conjunction with other vulnerabilities to achieve arbitrary code execution.

Additionally, there is a possibility that the String may allow an attacker to overwrite bytes on the heap. Often, heap-corruption vulnerabilities can be directly exploited to achieve code execution; that is, in the worst case scenario, an attacker could create a malicious module that exploits this vulnerability to get remote code execution (RCE) on validators' machines.

Due to the limited time available for this assessment, we did not attempt to exploit this vulnerability beyond the following simple proof of concept, which demonstrates the ability to obtain a String containing invalid UTF-8 sequences:

#[test]
fun zellic_malformed_utf8_oob_read() {
    let invalid_utf8_vec = b"\x01\x01\x80";

    // obtain invalid String
    let res = to_vector_string(invalid_utf8_vec);
    assert!(!vector::is_empty(&res), 0);
    let s = vector::pop_back(&mut res);

    // trigger crash
    let needle = string::utf8(b"");
    string::index_of(&s, &needle);
}

The output of the test is the following:

thread '<unnamed>' panicked at library/core/src/panicking.rs:156:5:
unsafe precondition(s) violated: hint::unreachable_unchecked must never be reached
stack backtrace:
   0: rust_begin_unwind
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/
         panicking.rs:645:5
   1: core::panicking::panic_nounwind_fmt::runtime
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/
         panicking.rs:110:18
   2: core::panicking::panic_nounwind_fmt
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/
         panicking.rs:123:9
   3: core::panicking::panic_nounwind
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/
         panicking.rs:156:5
   4: core::hint::unreachable_unchecked::precondition_check
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/
         intrinsics.rs:2799:21
   5: core::hint::unreachable_unchecked
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/
         hint.rs:101:5
   6: core::option::Option<T>::unwrap_unchecked
         at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/
         option.rs:1041:30
   7: core::str::validations::next_code_point
...

Following the stack trace, we can identify the cause of the crash as being the following code:

pub const unsafe fn unwrap_unchecked(self) -> T {
    match self {
        Some(val) => val,
        // SAFETY: the safety contract must be upheld by the caller.
        None => unsafe { hint::unreachable_unchecked() },
    }
}

Recommendations

Ensure all Strings in the vector are valid before returning the vector:

public fun to_vector_string(v: vector<u8>): vector<String> {
-    from_bytes<vector<String>>(v)
+    let vec_string = from_bytes<vector<String>>(v);
+    vector::for_each_ref(&vec_string, |s| {
+        assert!(string::internal_check_utf8(string::bytes(s)), EINVALID_UTF8);
+    });
+    vec_string
}

Remediation

Zellic © 2025Back to top ↑