Skip to content

fix(runtime): correlate ProcessIO responses by top-level id#195

Merged
bbopen merged 1 commit intomainfrom
codex/fix-processio-id-correlation
Feb 28, 2026
Merged

fix(runtime): correlate ProcessIO responses by top-level id#195
bbopen merged 1 commit intomainfrom
codex/fix-processio-id-correlation

Conversation

@bbopen
Copy link
Owner

@bbopen bbopen commented Feb 28, 2026

Summary\n- replace regex ID extraction with strict top-level JSON id parsing in ProcessIO\n- treat id=0 as a valid request/response correlation key\n- add regressions for nested payload id values and id=0\n\n## Validation\n- npx vitest --run test/transport.test.ts test/runtime_node.test.ts

@coderabbitai
Copy link

coderabbitai bot commented Feb 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6245e95 and 5eeaf30.

📒 Files selected for processing (3)
  • src/runtime/process-io.ts
  • test/runtime_node.test.ts
  • test/transport.test.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: os (windows-latest)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:636-637
Timestamp: 2026-01-20T16:01:29.276Z
Learning: In the tywrap BridgeProtocol ProcessIO transport (ADR-002), when implementing `restartAfterRequests` functionality, use drain-before-restart as the default: wait for `pendingRequests.size === 0` before killing and restarting the process. This prevents rejecting in-flight requests during restart. Alternative graceful restart (spawn new process, drain old) adds complexity with two concurrent processes.
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:40.872Z
Learning: In `src/runtime/bridge-core.ts` and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (e.g., runtime shape checks on error payloads) in tight loops unless the protocol boundary is untrusted or there's a concrete bug report. The Python bridge protocol is controlled and validated via tests, so defensive checks would add unnecessary branching overhead without meaningful benefit.
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-298
Timestamp: 2026-01-19T21:48:45.693Z
Learning: In `src/runtime/bridge-core.ts`, keep `normalizeErrorPayload` to validate error payloads from the Python subprocess. The subprocess boundary is effectively untrusted, and normalizing error responses prevents `undefined: undefined` errors on malformed payloads. Error responses are not the hot path, so the small validation overhead is acceptable for the added resilience.
📚 Learning: 2026-01-20T16:01:14.323Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:599-602
Timestamp: 2026-01-20T16:01:14.323Z
Learning: In `src/runtime/node-bridge.ts` (NodeBridge), a test message is sent to the Python subprocess to confirm the bridge is responsive before marking it as ready.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
📚 Learning: 2026-01-19T21:00:52.689Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: test/fixtures/out_of_order_bridge.py:29-48
Timestamp: 2026-01-19T21:00:52.689Z
Learning: In `test/fixtures/out_of_order_bridge.py`, the fixture intentionally leaves a pending request unanswered at EOF to simulate missing/out-of-order responses and validate bridge behavior when requests never complete; this is the exact failure mode being tested and must be preserved.

Applied to files:

  • test/transport.test.ts
📚 Learning: 2026-01-19T21:00:30.825Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: test/adversarial_playground.test.ts:339-360
Timestamp: 2026-01-19T21:00:30.825Z
Learning: In adversarial tests for the bridge (test/adversarial_playground.test.ts), recovery assertions should be kept in the finally block adjacent to cleanup to verify resilience under stress, even if that means post-timeout checks might mask the original error. The goal is to surface recovery failures as legitimate test failures.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
📚 Learning: 2026-01-19T21:48:45.693Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-298
Timestamp: 2026-01-19T21:48:45.693Z
Learning: In `src/runtime/bridge-core.ts`, keep `normalizeErrorPayload` to validate error payloads from the Python subprocess. The subprocess boundary is effectively untrusted, and normalizing error responses prevents `undefined: undefined` errors on malformed payloads. Error responses are not the hot path, so the small validation overhead is acceptable for the added resilience.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
  • src/runtime/process-io.ts
📚 Learning: 2026-01-19T21:48:27.823Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: test/runtime_bridge_fixtures.test.ts:59-66
Timestamp: 2026-01-19T21:48:27.823Z
Learning: In fixture-based tests (e.g., test/runtime_bridge_fixtures.test.ts) and similar tests in the tywrap repository, prefer early returns when Python or fixture files are unavailable. Do not rely on Vitest dynamic skip APIs; a silent pass is intentional for environments lacking Python/fixtures. Treat missing fixtures as optional soft-guards and ensure the test remains non-disruptive in non-availability scenarios.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
📚 Learning: 2026-01-20T01:34:07.064Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 136
File: src/runtime/node.ts:444-458
Timestamp: 2026-01-20T01:34:07.064Z
Learning: When reviewing promise-based polling patterns (e.g., recursive setTimeout in waitForAvailableWorker functions), ensure that any recursive timer is tracked and cleared on timeout or promise resolution to prevent timer leaks. Do not rely on no-op checks after settlement; verify clearTimeout is invoked in all paths and consider using an explicit cancellation (e.g., AbortController) or a guard to cancel pending timers when the promise settles.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
  • src/runtime/process-io.ts
📚 Learning: 2026-01-20T18:37:05.670Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: src/runtime/process-io.ts:37-40
Timestamp: 2026-01-20T18:37:05.670Z
Learning: In the tywrap repo, ESLint is used for linting (not Biome). Do not flag regex literals that contain control characters (e.g., \u001b, \u0000-\u001F) as lint errors, since the current ESLint configuration allows them. When reviewing TypeScript files (e.g., src/**/*.ts), rely on ESLint results and avoid raising issues about these specific control-character regex literals unless there is a competing config change or a policy requiring explicit escaping.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
  • src/runtime/process-io.ts
📚 Learning: 2026-01-20T16:00:49.829Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:203-211
Timestamp: 2026-01-20T16:00:49.829Z
Learning: In the BridgeProtocol implementation (tywrap), reject Map and Set explicitly before serialization (e.g., in safeStringify or the serialization entrypoint) with a clear error like "Bridge protocol does not support Map/Set values". Do not rely on post-hoc checks of non-string keys at the point of JSON.stringify, since Maps/Sets cannot round-trip. This should be enforced at the boundary where data is prepared for serialization to ensure deterministic errors and prevent invalid data from propagating.

Applied to files:

  • test/transport.test.ts
  • test/runtime_node.test.ts
  • src/runtime/process-io.ts
📚 Learning: 2026-01-19T21:14:40.872Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:40.872Z
Learning: In `src/runtime/bridge-core.ts` and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (e.g., runtime shape checks on error payloads) in tight loops unless the protocol boundary is untrusted or there's a concrete bug report. The Python bridge protocol is controlled and validated via tests, so defensive checks would add unnecessary branching overhead without meaningful benefit.

Applied to files:

  • test/runtime_node.test.ts
📚 Learning: 2026-01-19T21:14:29.869Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:375-385
Timestamp: 2026-01-19T21:14:29.869Z
Learning: In the runtime env-var parsing (e.g., in src/runtime/bridge-core.ts and similar modules), adopt a tolerant, best-effort policy: numeric env values should parse by taking the leading numeric portion (e.g., TYWRAP_CODEC_MAX_BYTES=1024abc -> 1024). Only reject clearly invalid values (non-numeric start or <= 0). This reduces surprising failures from minor typos. Add tests to cover partial numeric prefixes, and ensure downstream logic documents the trusted range; consider a small helper to extract a positive integer from a string with a safe fallback.

Applied to files:

  • src/runtime/process-io.ts
📚 Learning: 2026-01-19T21:14:35.390Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:35.390Z
Learning: In src/runtime/bridge-core.ts and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (such as runtime, shape, or error-payload checks) inside tight loops unless the protocol boundary is untrusted or there is a concrete bug report. The Python bridge protocol is controlled via tests, so these checks add unnecessary branching overhead without meaningful benefit. Apply this guidance to other hot-path runtime loops in src/runtime/**/*.ts, and re-enable additional validations only when a documented risk or failure scenario is identified. Ensure tests cover protocol validation where applicable.

Applied to files:

  • src/runtime/process-io.ts
🧬 Code graph analysis (1)
test/transport.test.ts (1)
src/runtime/process-io.ts (1)
  • ProcessIO (161-916)
🔇 Additional comments (5)
test/runtime_node.test.ts (1)

115-126: Good regression coverage for nested id payloads.

This test cleanly reproduces the prior failure mode and confirms no timeout when nested id fields are present in args.

test/transport.test.ts (2)

592-620: Top-level id correlation test is well targeted.

This directly validates the intended correlation rule and guards against nested id false matches.


622-639: Nice edge-case coverage for id = 0.

This is the exact regression guard needed for falsy-id handling.

src/runtime/process-io.ts (2)

109-130: extractMessageId rewrite is correct and robust.

Top-level JSON parsing with integer validation cleanly prevents nested-field mis-correlation.


244-246: Null-check updates correctly fix falsy-id handling.

Using === null in both send and response paths keeps id = 0 valid and makes correlation behavior consistent.

Also applies to: 592-594


📝 Walkthrough

Walkthrough

The pull request modifies the message ID extraction logic in the runtime process-io module, replacing regex-based parsing with JSON.parse-based validation that ensures the id field is a numeric integer. Call sites are updated to use explicit null checks instead of truthiness checks, with corresponding test cases validating the new behavior.

Changes

Cohort / File(s) Summary
Message ID Extraction Logic
src/runtime/process-io.ts
Replaced regex-based message ID extraction with JSON.parse validation that ensures id is a numeric integer, returns null for invalid input, and updated all call sites to use explicit null checks instead of truthiness checks.
Test Coverage
test/runtime_node.test.ts, test/transport.test.ts
Added tests verifying correct behavior with nested id fields and edge cases: request/response correlation uses top-level id, response with id 0 participates in correlation, and nested id fields in args don't interfere with message routing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

bug, area:runtime-node, priority:p2

Poem

A rabbit hops through parsing code,
Where JSON paths now brightly glow,
No regex traps along the way,
Just null checks keeping strays at bay,
Message IDs find their home today! 🐰✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the main change: fixing ProcessIO response correlation to use the top-level id field, which aligns with the core modifications in the codebase.
Description check ✅ Passed The description accurately outlines the three key changes: replacing regex ID extraction with JSON parsing, treating id=0 as valid, and adding regression tests, all of which are reflected in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/fix-processio-id-correlation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot added bug Something isn't working area:runtime-node Area: Node runtime bridge priority:p2 Priority P2 (medium) labels Feb 28, 2026
@bbopen bbopen merged commit 38406b4 into main Feb 28, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:runtime-node Area: Node runtime bridge bug Something isn't working priority:p2 Priority P2 (medium)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant