MMU-based epoch interruption #12990
…and a test that shows it makes it through. TODO: There are other little corners where `epoch_interruption` comes up in the code where we should mention `epoch_interruption_via_mmu` too.
…poch checks to loop headers as well. Cache the interrupt page ptr in a local for speed, as we did with the epoch deadline.

Here is how I interpret the generated code in epoch-interruption-mmu.wat:

```
;; @001B v2 = load.i64 notrap aligned readonly can_move v0+8  ;; Skip over magic number (4b) and alignment (another 4b).
;; @001B v3 = load.i64 notrap aligned v2+16                   ;; Get interrupt page ptr.
;; @001B v4 = load.i32 aligned readonly v3                    ;; Read from page ptr.
```
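To make the `+8` and `+16` offsets in the loads above concrete, here is a rough sketch using `ctypes`. The struct and field names are hypothetical and the layouts are inferred only from the comments on those loads; they are not taken from Wasmtime's actual definitions.

```python
import ctypes


class VMContextHeader(ctypes.Structure):
    # Hypothetical layout: 4-byte magic, 4 bytes of padding, then a pointer.
    _fields_ = [
        ("magic", ctypes.c_uint32),
        ("padding", ctypes.c_uint32),
        ("store_context_ptr", ctypes.c_uint64),  # what `v2` loads (v0+8)
    ]


class StoreContextPrefix(ctypes.Structure):
    # Hypothetical: two 8-byte fields precede the interrupt page pointer.
    _fields_ = [
        ("field0", ctypes.c_uint64),
        ("field1", ctypes.c_uint64),
        ("interrupt_page_ptr", ctypes.c_uint64),  # what `v3` loads (v2+16)
    ]


# Byte offsets of the two pointers the generated code chases.
print(VMContextHeader.store_context_ptr.offset)       # 8
print(StoreContextPrefix.interrupt_page_ptr.offset)   # 16
```

The third load (`v4`) then reads through `interrupt_page_ptr` itself and discards the result, so it has no struct offset of its own.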
These are all just 8-byte offset increases from the addition of my epoch interrupt page ptr field to `VMStoreContext`. This script helped me show it:
```python
"""Compare runs of - and + blocks of a diff, and assert that the only
differences between them are differences in hex and decimal numbers therein.
Further, assert that those differences are a rise of 8, representing the size of
the field I added.

Output the diff with the proven-correct regions resolved in favor of the +
lines. Any remaining diff lines are suspicious and should be manually examined.
"""
import re
from sys import argv


def is_diff_line(s, plus_or_minus):
    return bool(re.match(r"^ +" + "\\" + plus_or_minus, s))


def is_minus_line(s):
    return is_diff_line(s, "-")


def is_plus_line(s):
    return is_diff_line(s, "+")


def check_line_pairs(file_path):
    with open(file_path, 'r') as file:
        lines = file.readlines()

    i = 0
    while i < len(lines):
        if is_minus_line(lines[i]):
            minus_block = []
            while i < len(lines) and is_minus_line(lines[i]):
                minus_block.append(lines[i])
                i += 1

            plus_block = []
            while i < len(lines) and is_plus_line(lines[i]):
                plus_block.append(lines[i])
                i += 1

            if len(minus_block) != len(plus_block):
                print(" + BLOCK LENGTHS DIFFERED.")
                print("".join(minus_block))
                print("".join(plus_block))
                continue

            # Compare the two blocks line by line
            for line1, line2 in zip(minus_block, plus_block):
                # Extract numbers (both decimal and hexadecimal) from both lines
                numbers1 = [int(num, 16) if num.startswith("0x") else int(num)
                            for num in re.findall(r'0x[0-9a-fA-F]+|\d+', line1)]
                numbers2 = [int(num, 16) if num.startswith("0x") else int(num)
                            for num in re.findall(r'0x[0-9a-fA-F]+|\d+', line2)]

                # Check if the numbers differ by 0 or 8
                if len(numbers1) == len(numbers2) and all(n2 - n1 in (0, 8) for n1, n2 in zip(numbers1, numbers2)):
                    # It's just an increment (or nothing), so keep the new line:
                    print(re.sub(r"^( +)\+", r"\1 ", line2), end="")
                else:
                    print(line1, end="")
                    print(line2, end="")
        else:
            print(lines[i], end="")
            i += 1


check_line_pairs(argv[1])
```
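For a feel of what the per-line check in the script accepts and rejects, here is a small standalone sketch of the same "every number rose by 0 or 8" rule. The helper names and the sample diff lines are made up for illustration; they are not from the PR's test expectations.

```python
import re

NUMBER = re.compile(r"0x[0-9a-fA-F]+|\d+")


def numbers_in(line):
    """Return every hex or decimal literal in `line` as an int."""
    return [int(tok, 16) if tok.startswith("0x") else int(tok)
            for tok in NUMBER.findall(line)]


def only_shifted_by(old_line, new_line, delta=8):
    """True if the lines hold the same count of numbers, each up by 0 or `delta`."""
    old, new = numbers_in(old_line), numbers_in(new_line)
    return len(old) == len(new) and all(n - o in (0, delta) for o, n in zip(old, new))


# Hypothetical diff lines: an offset moving from 0x18 to 0x20 is accepted.
print(only_shifted_by("  -  movq 0x18(%rdi), %rax",
                      "  +  movq 0x20(%rdi), %rax"))   # True

# An offset moving by 16 is left in the output for manual review.
print(only_shifted_by("  -  v3 = load.i64 v2+16",
                      "  +  v3 = load.i64 v2+32"))     # False
```

Note that digits inside value names such as `v3` count as numbers too, so a renumbered value is flagged as suspicious unless it moved by exactly 8.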
Now it is initialized only if the `epoch-interruption-via-mmu` option is on. And, because the only instantiation of `VMStoreContext` is in the course of instantiating a `StoreOpaque`, a decent place to dispose of it is in `Drop for StoreOpaque`.
Keep the guts of the page-protecting operation on `VMStoreContext` next to where the page is mapped and unmapped.
…with_context` instruction. This will give us a convenient place to keep track of dead-load instruction locations and help us reserve the particular registers we need.

* Add `mem_flags_aligned_read_only` helper so we can construct aligned-and-read-only `MemFlags`es in ISLE.
* Add a `preg_rdi` constructor so we can refer to RDI in ISLE.

TODO: Reserve a scratch register to hold the return address.
This gives us a place to put regalloc constraints and to gather metadata (specifically, instruction locations).

Add a compile disas test to make sure `dead_load_with_context` is still emitting an acceptable dead load. It is. The only difference is that it's loading into `rdx` rather than `edx` now, probably due to my new regalloc constraints:

```
- movl (%rdx), %edx
+ movq (%rdx), %rdx
```

Also...

* Make the new instruction a `.call()` for consistency with `stack_switch` being one.
* Move the RDI-specificity to the regalloc constraints, which lets us remove the `preg_rdi` ISLE constructor I had added.
* Reserve r10 as a scratch register.
This is an implementation of #1749, specifically @cfallin's roadmap, with the goal of reducing the overhead of checking for the end of epochs.
Paul ran some benchmarks on this (broadly agreeing with our real-world experiments) which tell us:
- `-Wepoch-interruption=y` is a 14.4% hit versus doing nothing.

The above numbers are from SpiderMonkey, which I deem the most representative benchmark.
Status:
- `DeadLoadWithContext` Cranelift instruction, and use it.
- `DeadLoadWithContext` emit metadata into compiled-artifact tables to tell the signal handler this is an interruption-point load.

If the TLB shootdown arising from frobbing the permissions on the "interrupt page" proves too expensive, we can try a more indirect load instead: rather than changing page permissions, we change the address we're dead-loading from so that it points to either a (statically) allowed or a forbidden page. (Chris floated this idea at the 2026-04-08 Cranelift meeting.) Few of the other mechanics would need to change.
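As a way to picture the indirect variant, here is a toy model in plain Python. Nothing in it touches real memory: the addresses, class, and function names are all hypothetical, and the `PageFault` exception merely stands in for the fault that the signal handler would catch.

```python
class PageFault(Exception):
    """Stands in for the hardware fault a signal handler would receive."""


READABLE_PAGE = 0x1000   # hypothetical page that is always readable
FORBIDDEN_PAGE = 0x2000  # hypothetical page that is never readable


class Store:
    def __init__(self):
        self.interrupt_ptr = READABLE_PAGE

    def request_interrupt(self):
        # Only a pointer store: page permissions are never changed.
        self.interrupt_ptr = FORBIDDEN_PAGE

    def clear_interrupt(self):
        self.interrupt_ptr = READABLE_PAGE


def dead_load(store):
    """Model of the epoch check: load through the pointer, discard the value."""
    if store.interrupt_ptr == FORBIDDEN_PAGE:
        raise PageFault(hex(store.interrupt_ptr))


def run_loop(store, iterations, interrupt_at):
    """Run a loop with a dead load at each header; return iterations completed."""
    done = 0
    try:
        for i in range(iterations):
            if i == interrupt_at:
                store.request_interrupt()
            dead_load(store)
            done += 1
    except PageFault:
        pass
    return done


print(run_loop(Store(), 10, 3))   # 3
print(run_loop(Store(), 10, 99))  # 10
```

The trade-off this model suggests is that the check has to load the pointer before the dead load itself, in exchange for never changing page permissions at interrupt time.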