Add opt-in Ractor-safe InstanceRegistry via RICE_RACTOR_SAFE #401
javier-sy wants to merge 4 commits into ruby-rice:master
InstanceRegistry is the only Rice registry that performs writes at runtime (lookup, add, remove, clear). The other registries (TypeRegistry, NativeRegistry, HandlerRegistry, ModuleRegistry) are written once during Init and are read-only at runtime.
When RICE_RACTOR_SAFE is defined, a std::recursive_mutex protects all mutable InstanceRegistry operations. This allows C extensions built with Rice to be used safely from multiple Ruby Ractors. Without the define, the generated code is identical to the current Rice: zero impact on existing projects.
Usage in extconf.rb:
$CXXFLAGS << " -DRICE_RACTOR_SAFE"
A minimal Rice extension (Counter class) compiled with -DRICE_RACTOR_SAFE, plus 5 Minitest tests verifying:
- Single Ractor creating and using Rice-wrapped objects
- Sequential Ractors with rapid handoff
- Concurrent Ractors (would segfault without the mutex)
- Many objects in a Ractor (stresses InstanceRegistry add/lookup)
- Main thread and Ractor using Rice simultaneously
The tests auto-build the extension on first run. They are located in test/ruby/, matching the existing Rake::TestTask glob.
A second test extension compiled WITHOUT -DRICE_RACTOR_SAFE proves that concurrent Ractor access to Rice-wrapped objects corrupts the unprotected InstanceRegistry (std::map). The corruption manifests as a hang (infinite loop in the corrupted red-black tree), not always as a segfault. The test runs in a subprocess with a 30s timeout — hang or crash both confirm the bug. This is the counterpart to test_ractor.rb which proves the same workload succeeds WITH RICE_RACTOR_SAFE.
- docs/why_rice.md: expand Thread Safety section with Ractor support subsection. Clarifies that without the define behavior is unchanged, documents the extconf.rb usage, and notes Ruby 3.4.x compatibility (Ruby 4.x pending validation).
- CHANGELOG.md: add unreleased entry for the feature.
The #include <mutex> is conditional (#ifdef RICE_RACTOR_SAFE) inside InstanceRegistry.hpp itself, with no changes to rice.hpp, respecting the project's include management policy.
Just for more information: if the PR is accepted, my next step is to submit a PR that I have already prepared and tested locally for or-tools-ruby, using the new version of Rice. If it's not accepted, I can write a custom code adaptor for my counterpoint engine gem, so it's not a blocker for my project, but I think the solution could be helpful to more users than just me. Thank you again!
Hi @javier-sy - thanks for the very thoughtful PR.
I didn't actually realize Rice had that language about threading in the why rice document; that is ancient. I think I'll just remove it. However, Rice today is likely not thread safe. But it should be!
The simplest solution is that you can just turn off the instance registry; it is a global setting.
If we want to use it, I think the patch would stop crashes but doesn't fully fix the problem. If two Ractors wrap the same C++ pointer concurrently, they will both get Qnil from lookup, both allocate a Wrapper, and both add new instances. The second would silently overwrite the first, leaking a Wrapper and leaving a dangling Ruby VALUE. We would need to add a lookup_or_insert method or some such? Or allow the mutex to be held across lookup and insert, but that would require exposing it to calling code.
I also wonder if the complexity of RICE_RACTOR_SAFE is worth the bother. I don't know how much overhead the mutex will cause, but I'd lean towards either having them or not, rather than making it a configurable setting.
Anyway, if you are good with just turning off the registry, problem solved. If you want it enabled, then I think we will need to think more about the best way to protect InstanceRegistry (and probably other parts of the code base also).
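The check-then-act race described above, and the lookup_or_insert idea, can be sketched in plain C++. This is only an illustration under stated assumptions: AtomicRegistry, Handle and make_wrapper are hypothetical names, Handle stands in for a Ruby VALUE, and none of this is Rice's actual API. The point is that the lock is held across both the lookup and the insert, so two callers wrapping the same pointer cannot both miss and both allocate.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>

// Hypothetical stand-in for a Ruby VALUE; 0 plays the role of "not found".
using Handle = unsigned long;

// Hypothetical registry sketch; this is not Rice's InstanceRegistry.
class AtomicRegistry
{
public:
  // Returns the handle already stored for ptr, or calls make_wrapper once
  // and stores its result. One lock covers the lookup and the insert.
  Handle lookup_or_insert(const void* ptr, const std::function<Handle()>& make_wrapper)
  {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    auto it = map_.find(ptr);
    if (it != map_.end())
    {
      return it->second;  // another caller already wrapped this pointer
    }
    Handle handle = make_wrapper();  // runs while the lock is held
    map_.emplace(ptr, handle);
    return handle;
  }

private:
  std::map<const void*, Handle> map_;
  std::recursive_mutex mutex_;  // recursive: the factory may re-enter on the same thread
};

// Usage: the second call reuses the first wrapper instead of overwriting it.
inline void lookup_or_insert_usage()
{
  AtomicRegistry registry;
  int object = 0;
  Handle first = registry.lookup_or_insert(&object, [] { return Handle{7}; });
  Handle second = registry.lookup_or_insert(&object, [] { return Handle{8}; });
  assert(first == second);
}
```

The trade-off in this sketch is that the wrapper factory runs under the lock, which keeps the mutex private to the registry but lengthens the critical section; exposing the mutex to callers, the other option mentioned above, would avoid that at the cost of a wider API.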
Hi there,
I'd like to propose an opt-in mechanism for Ractor safety in Rice. I understand this touches a sensitive area — the docs explicitly state "Rice provides no mechanisms for dealing with thread safety" — so I want to explain the context that led here, and why I believe this is a minimal, safe change worth considering.
Motivation
I'm the author of MusaDSL, an algorithmic music composition framework for Ruby. I'm currently building a new module — a real-time counterpoint engine that generates multi-voice polyphony via constraint satisfaction. The engine uses or-tools-ruby (Google's OR-Tools CP-SAT solver) for finding valid voice-leading solutions, and or-tools-ruby is built on Rice. For real-time performance, the solver needs to run inside a Ractor so it doesn't block the sequencer thread — which is how I encountered the InstanceRegistry thread-safety issue.
I got it working by marking the extension as Ractor-safe (rb_ext_ractor_safe(true)), but concurrent access from multiple Ractors caused segfaults.
After investigation, I traced the issue to a single point: Rice's InstanceRegistry, the std::map that tracks C++ object wrappers. It's the only Rice registry that performs writes at runtime (add, lookup, remove). The other registries (TypeRegistry, NativeRegistry, HandlerRegistry, ModuleRegistry) are all written once during Init_* and are read-only afterward.
The change
When RICE_RACTOR_SAFE is defined at compile time, a std::recursive_mutex is added to InstanceRegistry, protecting all four mutable operations. Two files changed, 21 lines added:
- rice/detail/InstanceRegistry.hpp: conditional #include <mutex> + mutex member
- rice/detail/InstanceRegistry.ipp: std::lock_guard in lookup, add, remove, clear
Without the define, the generated code is identical to the current Rice: no mutex, no overhead, no behavior change. Existing projects are completely unaffected.
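As a rough illustration of the opt-in pattern described above, here is a minimal sketch of a registry whose locking compiles away unless RICE_RACTOR_SAFE is defined. SketchRegistry and the SKETCH_LOCK macro are hypothetical names invented for this example, and the handle type is a plain unsigned long rather than a Ruby VALUE; the real InstanceRegistry differs in its types and details.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#ifdef RICE_RACTOR_SAFE
#include <mutex>
// With the define: every guarded method takes the lock for its whole body.
#define SKETCH_LOCK() std::lock_guard<std::recursive_mutex> guard(mutex_)
#else
// Without the define: the macro expands to nothing, so no mutex code is generated.
#define SKETCH_LOCK() ((void)0)
#endif

// Hypothetical stand-in for Rice's InstanceRegistry; not the real class.
class SketchRegistry
{
public:
  // Returns the stored handle for ptr, or 0 when nothing is registered.
  unsigned long lookup(const void* ptr)
  {
    SKETCH_LOCK();
    auto it = map_.find(ptr);
    return it == map_.end() ? 0UL : it->second;
  }

  void add(const void* ptr, unsigned long handle)
  {
    SKETCH_LOCK();
    map_[ptr] = handle;
  }

  void remove(const void* ptr)
  {
    SKETCH_LOCK();
    map_.erase(ptr);
  }

  void clear()
  {
    SKETCH_LOCK();
    map_.clear();
  }

  std::size_t size()
  {
    SKETCH_LOCK();
    return map_.size();
  }

private:
  std::map<const void*, unsigned long> map_;
#ifdef RICE_RACTOR_SAFE
  std::recursive_mutex mutex_;  // member exists only in opt-in builds
#endif
};

// Usage: behaves the same with or without -DRICE_RACTOR_SAFE when single-threaded.
inline void sketch_registry_usage()
{
  SketchRegistry registry;
  int object = 0;
  registry.add(&object, 42UL);
  assert(registry.lookup(&object) == 42UL);
  registry.clear();
  assert(registry.size() == 0);
}
```

Compiling this sketch with and without -DRICE_RACTOR_SAFE gives the same single-threaded behavior; only the opt-in build contains the mutex member and the lock guards.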
Consumers opt in via their extconf.rb:
$CXXFLAGS << " -DRICE_RACTOR_SAFE"
Tests
The PR includes two Minitest test suites in test/ruby/ (matching the existing Rake::TestTask glob):
Positive tests (test_ractor.rb), a minimal Rice extension compiled with -DRICE_RACTOR_SAFE:
- Counter.new calls calibrated to fill ~1s, 2 Ractors creating that many objects simultaneously
Negative test (test_ractor_unsafe.rb), same extension compiled WITHOUT the define:
- Corruption of the unprotected std::map manifests as an infinite loop in the red-black tree, not always as a segfault
Both calibration strategies ensure the tests produce meaningful contention on any machine, fast or slow.
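The calibrate-then-contend idea can be approximated in plain C++, with std::thread standing in for Ractors. This is an analogy only, not the PR's Ruby tests: GuardedMap, calibrate and run_contended are hypothetical names, and the map stores integers rather than Rice wrappers. The sketch first measures how many insertions fit in a time budget on the current machine, then has several threads each perform that many insertions into one shared, mutex-guarded map.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mutex-guarded map standing in for the protected registry.
class GuardedMap
{
public:
  void add(std::size_t key, int value)
  {
    std::lock_guard<std::mutex> guard(mutex_);
    map_[key] = value;
  }

  std::size_t size()
  {
    std::lock_guard<std::mutex> guard(mutex_);
    return map_.size();
  }

private:
  std::map<std::size_t, int> map_;
  std::mutex mutex_;
};

// Calibration: count how many insertions fit into the budget on this machine.
inline std::size_t calibrate(std::chrono::milliseconds budget)
{
  GuardedMap scratch;
  std::size_t count = 0;
  const auto deadline = std::chrono::steady_clock::now() + budget;
  while (std::chrono::steady_clock::now() < deadline)
  {
    scratch.add(count, 0);
    ++count;
  }
  return count;
}

// Contention: each worker inserts per_worker distinct keys into one shared map.
// Returns the final size, which should equal workers * per_worker.
inline std::size_t run_contended(std::size_t workers, std::size_t per_worker)
{
  GuardedMap shared;
  std::vector<std::thread> threads;
  for (std::size_t w = 0; w < workers; ++w)
  {
    threads.emplace_back([&shared, w, per_worker]() {
      for (std::size_t i = 0; i < per_worker; ++i)
      {
        shared.add(w * per_worker + i, 0);
      }
    });
  }
  for (auto& thread : threads)
  {
    thread.join();
  }
  return shared.size();
}
```

A caller would run something like run_contended(2, calibrate(std::chrono::milliseconds(1000))) and check that the result is twice the calibrated count. Removing the lock_guard lines turns the same workload into a data race on std::map, which is undefined behavior and roughly the situation the negative test exercises.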
Why not a mutex by default?
I considered making the mutex unconditional, but:
The opt-in approach lets Ractor-aware projects enable it explicitly while keeping the default behavior unchanged.
Compatibility
Tested with Ruby 3.4.x (where Ractors are experimental). Ruby 4.x compatibility is pending validation — the Ractor API may change, but the mutex mechanism itself is pure C++ and should remain valid.
Note on the #include
I'm aware of the project policy about not adding #include directives to arbitrary files. The #include <mutex> is inside InstanceRegistry.hpp itself (not in rice.hpp), and is conditional on #ifdef RICE_RACTOR_SAFE. I believe this is the least invasive placement, but I'm happy to move it if you prefer a different approach.
I realize this is a change in Rice's stance on thread safety, and I completely understand if you'd prefer to approach it differently. I'm open to any feedback, whether that's changes to the implementation, a different API surface, or even just keeping this in a fork until Ractors stabilize. The important thing for me was to identify the exact problem (InstanceRegistry is the only runtime-mutable registry) and demonstrate that a minimal fix works.
Thank you for maintaining Rice — it's a remarkable piece of engineering that makes the Ruby/C++ boundary feel natural.