debug: hang flake instrumentation (do not merge) #5
Closed
DO NOT MERGE — temporary diagnostics on a debug branch. Logs CANCEL, CANCEL_KILLED, TRANSITION_TO_WAITING, WAIT_BEGIN, WAIT_RESULT events with run id, cid, kill response and exit code to /tmp/fbi-debug.log; CI uploads it as the fbi-debug-log artifact on every run. Drops retries to 0 so the flake surfaces. Once we have the data and a fix, revert this branch.
Adds /fbi-state/quantico-debug.log marker for which branch quantico took (Exited/SleepingForever/Err) and uploads per-run state dirs so we can correlate quantico's outcome with the actor's TRANSITION_TO_WAITING decision.
Root cause of the hang-test 'succeeded' flake: SQLite reuses run rowids
when a prior run is deleted (no AUTOINCREMENT). When test N+1 lands on
an id whose container from test N is still alive (Waiting phase, listener
still connected), the container's bind mount on runs_dir/<id>/state is
still active. setup_run_dir's del_dir_r then fails with EBUSY, the
directory survives, and result.json from the prior run leaks into the
new run. read_outcome reads the stale exit_code; if the prior scenario
exited 0, the new run is reported 'succeeded' regardless of what
actually happened in its container.
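
The rowid reuse described above can be reproduced in isolation. The following is a minimal sketch using Python's standard `sqlite3` module; the `runs` table, its `scenario` column, and the helper name `next_id_after_delete` are illustrative assumptions, not the project's actual schema or code.

```python
import sqlite3


def next_id_after_delete(autoincrement: bool) -> int:
    """Insert three runs, delete the newest, insert again, return the new id.

    The table layout here is hypothetical; it only illustrates how SQLite
    assigns rowids with and without AUTOINCREMENT.
    """
    conn = sqlite3.connect(":memory:")
    pk = "INTEGER PRIMARY KEY"
    if autoincrement:
        pk += " AUTOINCREMENT"
    conn.execute(f"CREATE TABLE runs (id {pk}, scenario TEXT)")

    # Three runs get ids 1, 2 and 3.
    for scenario in ("ok", "slow", "crash-fast"):
        conn.execute("INSERT INTO runs (scenario) VALUES (?)", (scenario,))

    # Deleting the row with the largest id frees that id for reuse
    # unless AUTOINCREMENT is in effect.
    conn.execute("DELETE FROM runs WHERE id = 3")

    cur = conn.execute("INSERT INTO runs (scenario) VALUES (?)", ("hang",))
    new_id = cur.lastrowid
    conn.close()
    return new_id


if __name__ == "__main__":
    print("without AUTOINCREMENT:", next_id_after_delete(False))
    print("with AUTOINCREMENT:   ", next_id_after_delete(True))
```

Without AUTOINCREMENT the new run lands on the deleted run's id, which is the precondition for the stale `result.json` leak; with AUTOINCREMENT the id is never handed out again.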
Two-part fix:
1. Reorder do_mock_launch / do_real_launch to force-remove the prior
   container BEFORE setup_run_dir, so the bind mount is released and
   del_dir_r can clean the directory.
2. As a defensive layer, explicitly delete the individual state signal
   files (result.json, agent-status, session-id, ready) by path. unlink
   works on individual files inside a bind-mounted directory even when
   the directory itself can't be removed, so even if a future bug
   re-introduces the ordering issue, read_outcome won't see stale data.
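
The ordering and the defensive unlink can be sketched together as follows. This is a hedged illustration in Python, not the code from this branch: `prepare_run_dir` and the `remove_container` callback are hypothetical stand-ins for the launch path and the force-remove step, and only the four signal file names come from the commit message.

```python
import shutil
from pathlib import Path
from typing import Callable

# Signal file names taken from the commit message above.
SIGNAL_FILES = ("result.json", "agent-status", "session-id", "ready")


def prepare_run_dir(run_dir: Path, remove_container: Callable[[], None]) -> Path:
    """Hypothetical launch-time setup showing both parts of the fix."""
    # Part 1: remove the prior container first so its bind mount on
    # run_dir/state is released before any cleanup is attempted.
    remove_container()

    state_dir = run_dir / "state"

    # Best-effort recursive delete. With a live bind mount this would fail
    # (EBUSY) and leave the directory behind, hence the next step.
    shutil.rmtree(run_dir, ignore_errors=True)

    # Part 2: unlink each signal file by path, so stale data is gone even
    # if the directory itself survived the recursive delete.
    for name in SIGNAL_FILES:
        (state_dir / name).unlink(missing_ok=True)

    state_dir.mkdir(parents=True, exist_ok=True)
    return state_dir
```

The per-file unlink is redundant whenever the recursive delete succeeds; it is kept so that a stale outcome cannot be read if the ordering ever regresses.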
Verified via /tmp/fbi-runs-state artifacts on debug/hang-flake CI run
25166517481: run-3's quantico-debug.log said OUTCOME=SleepingForever
(correctly blocked) yet result.json had exit_code:1 — leftover from
the crash-fast run that previously held that id.
Temporary diagnostic branch — do not merge.