**TL;DR:** Nia Deckers implemented an FFI execution scheme for Miri built on ptrace and SIGSEGV, running at "8000 segfaults per second". The approach: fork a supervisor process, mark memory `PROT_NONE` so every access by foreign code triggers a segfault, disassemble the faulting instruction to determine the access size, then hijack the instruction pointer to jump into a fake function. A mutex resolves the resulting races, and the yaxpeax disassembler is expected to bring better cross-architecture support.
## Background: Miri's FFI Predicament
Miri is Rust's MIR interpreter; it precisely tracks pointer provenance and memory initialization. But when a program calls through FFI (Foreign Function Interface), the target code could be a precompiled C/C++ library or even a Java library, and Miri has no idea what memory operations that code performs.
> "You basically have to hard-code the results."
It is impossible to track every detail, especially when linking an arbitrary `.so` file. However, one can come up with a "worst-case" assumption that is much better than the current approach of "assume it might do anything to memory." This means tracking every memory access: address, size, and read/write direction.
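To make "address, size, and read/write direction" concrete, here is a minimal sketch of what one recorded access and a per-region summary might look like. The names `Access`, `Direction`, and `TrackedRegion` are hypothetical and chosen for illustration; they are not Miri's internal types.

```rust
/// Whether the foreign code read from or wrote to memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    Read,
    Write,
}

/// One observed memory access: where, how many bytes, and in which direction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Access {
    pub addr: usize,
    pub size: usize,
    pub dir: Direction,
}

/// Per-byte summary of what foreign code did to one region (hypothetical type).
pub struct TrackedRegion {
    base: usize,
    read: Vec<bool>,
    written: Vec<bool>,
}

impl TrackedRegion {
    pub fn new(base: usize, len: usize) -> Self {
        TrackedRegion { base, read: vec![false; len], written: vec![false; len] }
    }

    /// Record an access, clipping it to the part that overlaps this region.
    pub fn record(&mut self, access: Access) {
        let region_end = self.base + self.read.len();
        let start = access.addr.max(self.base);
        let end = access.addr.saturating_add(access.size).min(region_end);
        // If the access lies entirely outside the region, start >= end and
        // the loop body never runs.
        for addr in start..end {
            let offset = addr - self.base;
            match access.dir {
                Direction::Read => self.read[offset] = true,
                Direction::Write => self.written[offset] = true,
            }
        }
    }

    pub fn read_count(&self) -> usize {
        self.read.iter().filter(|&&b| b).count()
    }

    pub fn written_count(&self) -> usize {
        self.written.iter().filter(|&&b| b).count()
    }
}
```

With a summary like this, the interpreter only has to distrust the bytes that were actually touched instead of the whole address space.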
## The Solution: Ptrace + Segfaults
"I grew up in Eastern Europe, maybe a bit more conservative than many of you. So the first thing I thought of was to consult the sacred texts — Google results and Stack Overflow." She found a decade-old post: use SIGSEGV.
**Core idea**:
1. Fork Miri into two processes.
2. The "supervisor" process handles segfaults; the other process is Miri itself, jumping into FFI.
3. Before Miri enters FFI, set the target memory region to `PROT_NONE`. Any access triggers SIGSEGV.
4. The supervisor captures the segfault via ptrace and obtains the address that caused the fault.
5. But you also need the access size and read/write direction. How? Directly read the data at the instruction pointer of the other process, then feed it into a disassembler.
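Steps 1 to 3 can be sketched in a much simplified form. In the version below the parent uses plain `waitpid` instead of ptrace, so it only learns that the child died from a signal; it cannot read the faulting address or resume the child, which is exactly what the ptrace-based supervisor adds. The hand-written bindings and constants assume 64-bit Linux, and the function name `signal_from_protected_access` is made up for this example.

```rust
use std::os::raw::{c_int, c_void};
use std::ptr;

// Minimal hand-written bindings so the sketch needs no external crates.
// Signatures and constant values are assumptions that hold on 64-bit Linux.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, off: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
    fn mprotect(addr: *mut c_void, len: usize, prot: c_int) -> c_int;
    fn fork() -> c_int;
    fn waitpid(pid: c_int, status: *mut c_int, options: c_int) -> c_int;
    fn _exit(code: c_int) -> !;
}

const PROT_NONE: c_int = 0;
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
const MAP_PRIVATE: c_int = 0x02;
const MAP_ANONYMOUS: c_int = 0x20;
const LEN: usize = 4096;

/// Forks, lets the child touch a `PROT_NONE` region, and returns the signal
/// that killed the child (if any), as observed by the parent.
pub fn signal_from_protected_access() -> Option<i32> {
    unsafe {
        let region = mmap(ptr::null_mut(), LEN, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if region as isize == -1 {
            return None; // MAP_FAILED
        }
        let pid = fork();
        if pid < 0 {
            munmap(region, LEN);
            return None;
        }
        if pid == 0 {
            // Child: play the role of the process about to enter foreign code.
            if mprotect(region, LEN, PROT_NONE) != 0 {
                _exit(2);
            }
            // Any access to the protected region now faults.
            let _ = ptr::read_volatile(region as *const u8);
            _exit(0); // only reached if the access did not fault
        }
        // Parent: a stand-in for the supervisor, waiting for the child's fate.
        let mut status: c_int = 0;
        let waited = waitpid(pid, &mut status, 0);
        munmap(region, LEN);
        if waited != pid {
            return None;
        }
        // Low 7 bits hold the terminating signal; 0 means a normal exit and
        // 0x7f means "stopped", so neither counts as killed by a signal.
        let sig = status & 0x7f;
        if sig != 0 && sig != 0x7f { Some(sig) } else { None }
    }
}
```

On Linux the expected result is signal 11 (SIGSEGV). A real supervisor would attach with ptrace so that the fault stops the child instead of killing it.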
### Disassembly to Determine Access Details
"We're a Rust project team. We love parsing, verification, and such." She used the Capstone disassembler.
- Read the machine code bytes at the instruction pointer and disassemble them to get the size of the memory operand.
- If the instruction is a write, mark the accessed region as "initialized". The write might not actually cover every byte, but anything that is never written is definitely uninitialized, which is better than assuming all memory is initialized.
- The same information can show whether pointer provenance is exposed (if a read comes from a pointer).
**Limitations**: Currently only supports x86. Other architectures (e.g., ARM) have scalable vector instructions whose size depends on register values, making implementation too painful. "I gave up. But this problem will be addressed."
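To give a feel for what "get the size of the memory operand" involves, here is a toy decoder covering only the four basic x86-64 `mov` opcodes (`0x88`, `0x89`, `0x8A`, `0x8B`) with an optional `0x66` prefix and an optional REX prefix. It is a hand-rolled illustration, not how Capstone works and not the code from the talk; a real disassembler handles the whole instruction set. `MemAccess` and `decode_mov_access` are hypothetical names.

```rust
/// Size and direction of the memory operand of one instruction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MemAccess {
    pub size: u8,
    pub is_write: bool,
}

/// Decodes a tiny subset of x86-64: `mov` between a register and memory.
/// Returns `None` for anything else, including register-to-register moves.
pub fn decode_mov_access(code: &[u8]) -> Option<MemAccess> {
    let mut i = 0;
    let mut operand_16 = false;
    let mut rex_w = false;

    // Optional operand-size override prefix (selects 16-bit operands).
    if code.get(i) == Some(&0x66) {
        operand_16 = true;
        i += 1;
    }
    // Optional REX prefix (0x40..=0x4F); bit 3 is REX.W (64-bit operands).
    if let Some(&byte) = code.get(i) {
        if (0x40..=0x4F).contains(&byte) {
            rex_w = byte & 0x08 != 0;
            i += 1;
        }
    }

    let opcode = *code.get(i)?;
    let modrm = *code.get(i + 1)?;
    // ModRM.mod == 0b11 means both operands are registers: no memory access.
    if modrm >> 6 == 0b11 {
        return None;
    }

    // REX.W takes precedence over the 0x66 prefix.
    let wide = if rex_w { 8 } else if operand_16 { 2 } else { 4 };
    match opcode {
        0x88 => Some(MemAccess { size: 1, is_write: true }),     // mov r/m8, r8
        0x89 => Some(MemAccess { size: wide, is_write: true }),  // mov r/m, r
        0x8A => Some(MemAccess { size: 1, is_write: false }),    // mov r8, r/m8
        0x8B => Some(MemAccess { size: wide, is_write: false }), // mov r, r/m
        _ => None,
    }
}
```

For example, the bytes `48 89 07` (`mov [rdi], rax`) decode as an 8-byte write, while `48 89 C3` (`mov rbx, rax`) touches no memory. Even this tiny subset hints at why instructions whose width depends on runtime register values are so much harder to support.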
### Recovering from Segfaults
Once the address and size are known, the program needs to continue. Modify registers via ptrace:
- Set the instruction pointer to the address of a fake `extern "C"` function.
- This fake function does not unprotect the current page; instead, it raises `SIGSTOP` for the parent process to catch.
- "This isn't a calling convention. I don't know what this is. This is a crime against computing."
## Race Conditions and Fix: Mutex
Tests passed locally but always hung on CI. The cause was a lack of synchronization between Unix signals and IPC channel messages: a racing `SIGSTOP` could be received by another process before the message arrived, leaving it blocked forever.
"You see what I mean when I say this is like a psychological drama? I hate it. But I fixed it. I fixed a race condition by increasing parallelism." She ran multiple seeded Miri instances (the same test with different RNG seeds simultaneously) and eventually reproduced the hang locally. The final solution was a mutex around FFI, so that only one process can execute FFI at a time. This also incidentally fixed a pre-existing potential bug in Miri: concurrent access to global variables by external code in a multithreaded context.
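The "mutex around FFI" idea can be sketched with a single global lock that every foreign call must hold. This is only an illustration of the pattern under simple assumptions (threads in one process, standard library `Mutex`); `FFI_LOCK` and `with_ffi_lock` are hypothetical names, not Miri's.

```rust
use std::sync::Mutex;

/// One global lock: whoever holds it is the only one inside foreign code.
static FFI_LOCK: Mutex<()> = Mutex::new(());

/// Runs `call` while holding the FFI lock, so foreign calls never overlap.
pub fn with_ffi_lock<T>(call: impl FnOnce() -> T) -> T {
    // A panic in an earlier call poisons the mutex; the lock itself guards no
    // data, so it is fine to keep using it.
    let _guard = FFI_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    call()
}
```

A caller would wrap each foreign call, for instance `with_ffi_lock(|| unsafe { some_foreign_fn() })`, where `some_foreign_fn` stands for any external function. Serializing the calls gives up parallelism inside foreign code, but it means every fault the supervisor sees belongs to exactly one in-flight call.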
## Disassembler: From Capstone to yaxpeax
Capstone is a great disassembler, but what Nia needed wasn't disassembly per se. She needed a "magical machine that tells me the operand address size." Capstone also slowed down the check build. Later, with help from Oxide's iximeow (the yaxpeax project), whom she harassed into adding the needed functionality, yaxpeax became the better choice. "As far as I know, most of the issues have actually been fixed. Hopefully I can finally land this in Miri so we can get better cross-architecture support someday."
## Allocation Tracking: Shimming libc Functions
Segfaults alone aren't enough to handle scenarios where a pointer is returned (e.g., `malloc`). Miri needs to know which memory is a valid allocation. She shimmed the libffi/libc functions that can access memory (like `malloc`): a set of `extern "C"` shim functions is defined ahead of time, and during FFI calls ptrace redirects the instruction pointer into these fake functions. The fake functions then call Miri's allocator, allowing Miri to track the newly allocated memory.
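The shape of such a shim can be sketched as follows. The global table here is a stand-in for "Miri's allocator" and is purely hypothetical, as are the names `shim_malloc`, `shim_free`, and `tracked_size`; the ptrace redirection that would route a foreign `malloc` call into the shim is not shown.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Address -> size of every live allocation handed out by the shim.
/// A stand-in for the interpreter's own allocation bookkeeping.
static ALLOCATIONS: Mutex<BTreeMap<usize, usize>> = Mutex::new(BTreeMap::new());

/// Alignment used for all shim allocations (an assumption; 16 matches what
/// `malloc` typically guarantees on x86-64).
const ALIGN: usize = 16;

/// Replacement for `malloc`: allocates and records the new block.
pub extern "C" fn shim_malloc(size: usize) -> *mut u8 {
    if size == 0 {
        return std::ptr::null_mut();
    }
    let layout = match Layout::from_size_align(size, ALIGN) {
        Ok(layout) => layout,
        Err(_) => return std::ptr::null_mut(),
    };
    // SAFETY: the layout has a non-zero size.
    let ptr = unsafe { alloc(layout) };
    if !ptr.is_null() {
        ALLOCATIONS.lock().unwrap().insert(ptr as usize, size);
    }
    ptr
}

/// Replacement for `free`: releases the block only if the shim allocated it.
///
/// # Safety
/// `ptr` must be null or a pointer that is not used again after this call.
pub unsafe extern "C" fn shim_free(ptr: *mut u8) {
    if ptr.is_null() {
        return;
    }
    let size = ALLOCATIONS.lock().unwrap().remove(&(ptr as usize));
    if let Some(size) = size {
        let layout = Layout::from_size_align(size, ALIGN).unwrap();
        // SAFETY: the table proves this pointer came from `alloc` with this layout.
        unsafe { dealloc(ptr, layout) };
    }
}

/// Lets the interpreter ask whether a pointer is a known live allocation.
pub fn tracked_size(ptr: *const u8) -> Option<usize> {
    ALLOCATIONS.lock().unwrap().get(&(ptr as usize)).copied()
}
```

Because every block passes through the table, a pointer returned from foreign code can be checked against it before the interpreter treats the memory as a valid allocation.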
"I was told shimming the entire libC is overkill and completely unnecessary. Well, whatever you say."
## Summary: A Segfault-Driven Debugger
"The phrase '8000 segfaults per second' is really terrible. They'll be fixed eventually." Ultimately, this scheme not only allows Miri to execute arbitrary FFI but also accidentally provides a debugger — because you can precisely observe each memory access and crash site. Although it currently works only on Linux x86 and is full of "crimes against computing," it does the job.
**Source:** FFI in Miri at 8000 segfaults per second - RustWeek (https://youtu.be/9X-ngiKo_Y0)