A handful of optimizations for the DRC collector#12974

Merged
fitzgen merged 17 commits into bytecodealliance:main from fitzgen:drc-runtime-improvements
Apr 10, 2026

@fitzgen
Member

@fitzgen fitzgen commented Apr 6, 2026

Depends on #12969

See each commit message for details.

More coming soon after this.

@fitzgen fitzgen requested review from a team as code owners April 6, 2026 19:47
@fitzgen fitzgen requested review from alexcrichton and removed request for a team April 6, 2026 19:47
@fitzgen fitzgen force-pushed the drc-runtime-improvements branch from 79013cf to 56a5b5a Compare April 6, 2026 20:24
@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:ref-types Issues related to reference types and GC in Wasmtime labels Apr 6, 2026
@github-actions

github-actions bot commented Apr 6, 2026

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Member

@alexcrichton alexcrichton left a comment

I need to spend more time looking at "Combine dec_ref, trace, and dealloc into single-pass loop", but this is one thing I noticed. The later commits seem fine, though.

This is another case, though, where I suspect in-wasm GC allocation, GC mark/sweep, etc. would remove a huge amount of the overhead, since the host has to dance around "the heap could be corrupt at any time", which I believe loses a lot of perf. I realize that's a big undertaking, but we may want to discuss more seriously in a meeting at some point whether it's table stakes for shipping GC.

@fitzgen
Member Author

fitzgen commented Apr 7, 2026

> I realize that's a big undertaking, but we may want to discuss more seriously in a meeting at some point if it's table stakes or not for shipping gc.

Happy to discuss at a meeting, I'll add an item, but I find it super surprising that we would even entertain the idea of blocking enabling the GC proposal by default on self-hosting the free list (or even worse from a time-to-shipping perspective: self-hosting the whole collector runtime).

Member

@alexcrichton alexcrichton left a comment

@fitzgen, @cfallin, and I talked a bit more about GC things today, and we'll summarize our thinking at tomorrow's Wasmtime meeting as well.

@fitzgen fitzgen force-pushed the drc-runtime-improvements branch from 56a5b5a to 569278a Compare April 8, 2026 23:44
@fitzgen fitzgen enabled auto-merge April 8, 2026 23:45
@fitzgen fitzgen added this pull request to the merge queue Apr 8, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 9, 2026
@fitzgen fitzgen enabled auto-merge April 9, 2026 14:35
@fitzgen fitzgen added this pull request to the merge queue Apr 9, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 9, 2026
@fitzgen fitzgen enabled auto-merge April 9, 2026 18:49
@fitzgen fitzgen added this pull request to the merge queue Apr 9, 2026
@alexcrichton alexcrichton removed this pull request from the merge queue due to a manual request Apr 9, 2026
@alexcrichton
Member

I pulled this out of the queue manually due to the failure at https://github.com/bytecodealliance/wasmtime/actions/runs/24208166925/job/70669519553

@fitzgen fitzgen enabled auto-merge April 9, 2026 21:05
@fitzgen fitzgen added this pull request to the merge queue Apr 9, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 9, 2026
@fitzgen fitzgen enabled auto-merge April 10, 2026 19:18
fitzgen added 4 commits April 10, 2026 13:23
Ideally we would just use a `SecondaryMap<VMSharedTypeIndex, TraceInfo>` here
but allocating `O(num engine types)` space inside a store that uses only a
couple types seems not great. So instead, we just have a fixed size cache that
is probably big enough for most things in practice.
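
The trade-off described above can be illustrated with a small direct-mapped cache. This is only a sketch under stated assumptions: `TypeIndex`, `TraceInfo`, `TraceInfoCache`, and `CACHE_SIZE` are hypothetical stand-ins, and the commit does not say how the real cache is sized or how it evicts entries.

```rust
/// Hypothetical stand-in for an engine-wide type index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TypeIndex(pub u32);

/// Hypothetical per-type tracing metadata: byte offsets of GC-reference fields.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct TraceInfo {
    pub gc_ref_offsets: Vec<u32>,
}

/// Assumed capacity; the space used is fixed no matter how many types the engine has.
const CACHE_SIZE: usize = 64;

/// Direct-mapped cache: each type index maps to exactly one slot.
pub struct TraceInfoCache {
    slots: Vec<Option<(TypeIndex, TraceInfo)>>,
}

impl TraceInfoCache {
    pub fn new() -> Self {
        Self { slots: vec![None; CACHE_SIZE] }
    }

    /// Return the cached entry for `ty`, computing it (and evicting whatever
    /// occupied the slot) on a miss.
    pub fn get_or_insert_with(
        &mut self,
        ty: TypeIndex,
        compute: impl FnOnce() -> TraceInfo,
    ) -> &TraceInfo {
        let i = ty.0 as usize % CACHE_SIZE;
        let hit = matches!(&self.slots[i], Some((cached, _)) if *cached == ty);
        if !hit {
            self.slots[i] = Some((ty, compute()));
        }
        &self.slots[i].as_ref().unwrap().1
    }
}
```

A store that touches only a couple of types pays for `CACHE_SIZE` slots rather than one slot per engine type; the cost is that two types mapping to the same slot evict each other and must be recomputed.
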
Inline `dec_ref`, `trace_gc_ref`, and `dealloc` into
`dec_ref_and_maybe_dealloc`'s main loop so that we read the `VMDrcHeader` once
per object to get `ref_count`, type index, and `object_size`, avoiding 3
separate GC heap accesses and bounds checks per freed object.

For struct tracing, read gc_ref fields directly from the heap slice at known
offsets instead of going through `gc_object_data` → `object_range` → `object_size`,
which would re-read the `object_size` from the header.

301,333,979,721 -> 291,038,676,119 instructions (~3.4% improvement)
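
A toy model of the single-pass idea, for illustration only. The 12-byte header layout, the type constants, and the "every body word is a reference" rule are assumptions made for this sketch and do not reflect the real `VMDrcHeader` or Wasmtime's tracing metadata; only the function name `dec_ref_and_maybe_dealloc` comes from the commit message.

```rust
/// Hypothetical header: ref_count, type_index, object_size (three little-endian u32s).
const HEADER_SIZE: usize = 12;
/// Hypothetical type whose body holds no GC references.
const TYPE_LEAF: u32 = 0;
/// Hypothetical type whose body is entirely u32 GC references (0 means null).
const TYPE_STRUCT_OF_REFS: u32 = 1;

fn read_u32(bytes: &[u8], at: usize) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[at..at + 4]);
    u32::from_le_bytes(buf)
}

fn write_u32(bytes: &mut [u8], at: usize, value: u32) {
    bytes[at..at + 4].copy_from_slice(&value.to_le_bytes());
}

/// Drop one reference to `root`. Objects whose count reaches zero are traced and
/// freed in the same loop iteration. Returns the offsets of the freed objects.
fn dec_ref_and_maybe_dealloc(heap: &mut [u8], root: u32) -> Vec<u32> {
    let mut freed = Vec::new();
    let mut stack = vec![root];
    while let Some(obj) = stack.pop() {
        let start = obj as usize;
        // One bounds-checked slice yields the count, the type, and the size together.
        let header = &heap[start..start + HEADER_SIZE];
        let ref_count = read_u32(header, 0);
        let type_index = read_u32(header, 4);
        let object_size = read_u32(header, 8) as usize;

        if ref_count > 1 {
            // Still referenced elsewhere: store the decremented count and move on.
            write_u32(heap, start, ref_count - 1);
            continue;
        }

        // Last reference dropped: trace children straight from the body slice,
        // reusing the size that was already read from the header.
        if type_index == TYPE_STRUCT_OF_REFS {
            let body = &heap[start + HEADER_SIZE..start + object_size];
            for field in body.chunks_exact(4) {
                let child = read_u32(field, 0);
                if child != 0 {
                    stack.push(child);
                }
            }
        }
        freed.push(obj);
    }
    freed
}
```

The explicit stack replaces recursion, and the header is sliced once per object rather than once each for the decrement, the trace, and the deallocation.
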
…exists

When the GC store is already initialized and the allocation succeeds, avoid
async machinery entirely. This avoids the overhead of taking/restoring fiber
async state pointers on every allocation.

291,038,676,119 -> 230,503,364,489 instructions (~20.8% improvement)
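
The shape of this fast path can be sketched as follows. Everything here is a hypothetical model: `Store`, `GcHeap`, `alloc_slow`, the bump allocator, and the "reset the heap" stand-in for collection are invented for illustration, and the slow path merely marks where the fiber and async state handling would sit in the real code.

```rust
/// Hypothetical bump-allocated GC heap.
struct GcHeap {
    capacity: u32,
    next: u32,
}

impl GcHeap {
    /// Allocate `size` bytes, or return `None` if the heap is full.
    fn try_alloc(&mut self, size: u32) -> Option<u32> {
        let end = self.next.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        let offset = self.next;
        self.next = end;
        Some(offset)
    }
}

/// Hypothetical store whose GC heap is created lazily.
struct Store {
    gc_heap: Option<GcHeap>,
    /// Counts how often the expensive path runs, for demonstration.
    slow_path_calls: u32,
}

impl Store {
    fn alloc(&mut self, size: u32) -> Option<u32> {
        // Fast path: heap already exists and has room, so return immediately
        // without touching any of the slow-path machinery.
        if let Some(heap) = self.gc_heap.as_mut() {
            if let Some(offset) = heap.try_alloc(size) {
                return Some(offset);
            }
        }
        self.alloc_slow(size)
    }

    /// Slow path: stand-in for the work that may need async machinery
    /// (lazy initialization, collecting garbage, retrying).
    #[cold]
    fn alloc_slow(&mut self, size: u32) -> Option<u32> {
        self.slow_path_calls += 1;
        let heap = self
            .gc_heap
            .get_or_insert_with(|| GcHeap { capacity: 64, next: 0 });
        // Toy "collection": pretend every object was garbage, then retry.
        heap.next = 0;
        heap.try_alloc(size)
    }
}
```

In this model the common case, an initialized heap with free space, costs one branch and one bump; only initialization and heap exhaustion pay for the slow path.
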
Avoids converting `ModuleInternedTypeIndex` to `VMSharedTypeIndex` in host code,
which requires lookups in the instance's module's `TypeCollection`. We already
have helpers to do this conversion inline in JIT code.

230,503,364,489 -> 216,937,168,529 instructions (~5.9% improvement)
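
The instruction counts quoted in the three commit messages above chain together, so the per-commit percentages and their combined effect can be checked with a few lines of arithmetic. The helper name `reduction_percent` is made up for this sketch; the counts are copied from the commit messages.

```rust
/// Instruction counts before the first commit and after each of the three commits.
const COUNTS: [u64; 4] = [
    301_333_979_721,
    291_038_676_119,
    230_503_364_489,
    216_937_168_529,
];

/// Percentage reduction going from `before` to `after`.
fn reduction_percent(before: u64, after: u64) -> f64 {
    (before - after) as f64 / before as f64 * 100.0
}

/// Reduction contributed by each step, in order.
fn per_step_reductions() -> Vec<f64> {
    COUNTS
        .windows(2)
        .map(|pair| reduction_percent(pair[0], pair[1]))
        .collect()
}

/// Reduction from the first count to the last.
fn cumulative_reduction() -> f64 {
    reduction_percent(COUNTS[0], COUNTS[COUNTS.len() - 1])
}
```

The steps come out to roughly 3.4%, 20.8%, and 5.9%, matching the commit messages, and together they amount to about a 28% reduction in executed instructions on whatever benchmark produced these counts.
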
fitzgen added 12 commits April 10, 2026 13:23
Moves the `externref` host data cleanup inside the `ty.is_none()` branch of
`dec_ref_and_maybe_dealloc`, since only `externref`s have host
data. Additionally, the type check is sort of expensive since it involves
additional bounds-checked reads from the GC heap.

This reverts commit 41dcbd931170c0e510b5baf9e0cafa19a83c0ddd.

`Layout::from_size_align` rejects sizes greater than `isize::MAX`, causing
`add_capacity` to silently discard new capacity blocks that exceed this
limit. This meant the free list could not grow beyond ~2 GB on 32-bit even
though our `u32` indices can address up to ~4 GB.

Fix by calling `dealloc_impl` directly in `add_capacity`, bypassing the `Layout`
construction. The block index and size are already properly aligned `u32` values,
so the `Layout` validation is unnecessary for internal free list bookkeeping.

Also remove a redundant `debug_assert` in `dealloc_impl` that constructed a
`Layout` (hitting the same `isize::MAX` limitation), since the alignment
invariant is already checked by the adjacent assertions.
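
The `Layout` limitation behind this fix is easy to reproduce with the standard library alone. In the sketch below, `layout_accepts`, `FreeBlock`, and this `add_capacity` are hypothetical illustrations and not the free list's actual code; only `Layout::from_size_align` is a real API.

```rust
use std::alloc::Layout;

/// True if `Layout` accepts a block of `size` bytes with alignment `align`.
fn layout_accepts(size: usize, align: usize) -> bool {
    Layout::from_size_align(size, align).is_ok()
}

/// Hypothetical free-list entry tracked with plain `u32`s, no `Layout` involved.
#[derive(Debug, PartialEq, Eq)]
struct FreeBlock {
    index: u32,
    size: u32,
}

/// Hypothetical bookkeeping that records new capacity without building a `Layout`,
/// so a block larger than `isize::MAX` bytes on a 32-bit target is not rejected.
fn add_capacity(free_list: &mut Vec<FreeBlock>, index: u32, size: u32, align: u32) {
    // The alignment invariants are checked directly on the integers.
    debug_assert!(align.is_power_of_two());
    debug_assert_eq!(index % align, 0);
    debug_assert_eq!(size % align, 0);
    free_list.push(FreeBlock { index, size });
}
```

On a 32-bit target `isize::MAX` is about 2 GiB, so any larger block fails `Layout::from_size_align` even though it fits comfortably in a `u32`.
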
@fitzgen fitzgen force-pushed the drc-runtime-improvements branch from 3f251f3 to cae3837 Compare April 10, 2026 20:27
@fitzgen fitzgen added this pull request to the merge queue Apr 10, 2026
Merged via the queue into bytecodealliance:main with commit e920961 Apr 10, 2026
48 checks passed
@fitzgen fitzgen deleted the drc-runtime-improvements branch April 10, 2026 22:07