bug: parallel subagent PreToolUse races PostToolUse in daemon queue, misattributing AI lines to human #994

@svarlamov

Summary

When multiple Claude subagents edit the same file in parallel, a race condition causes AI-written lines to be permanently attributed to the human author. The lines are completely absent from the final git note — they appear as a "human gap" in git-ai blame.

Reproduced from session 3c663e49-9c9f-4225-aea3-8efb93ab4471 in git-ai-project/git-ai, commit 89cdae17 (test: add comprehensive rebase attribution integration tests) — 12 lines in tests/integration/rebase_realworld.rs (603–614) are attributed to Sasha Varlamov despite being 100% AI-generated.

Root Cause

The bug lives in the interaction between agent_presets.rs and the async checkpoint daemon.

1. PreToolUse always creates a Human checkpoint (src/commands/checkpoint_agent/agent_presets.rs):

if hook_event_name == Some("PreToolUse") {
    return Ok(AgentRunResult {
        checkpoint_kind: CheckpointKind::Human,
        ...
    });
}

2. PreToolUse returns immediately (it skips transcript parsing, ~19ms), while PostToolUse parses the full JSONL transcript (~37ms). With async_mode = true, both checkpoints are sent to the daemon and land in its FIFO queue, so when the two hooks fire within about 18ms of each other, the Human PreToolUse checkpoint reaches the daemon before the AI PostToolUse checkpoint.

3. The daemon processes the Human checkpoint first. By the time the Human PreToolUse fires, Subagent A has already written its content to disk, so the Human checkpoint records that content as new since the last AI checkpoint, i.e. as human changes. When the AI PostToolUse checkpoint is processed next, the file has not changed since the Human checkpoint, the diff is empty, and the AI checkpoint never claims those lines.
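The send-order part of the race (step 2) can be sketched as a toy model. This is only an illustration, not git-ai's code: the `Hook` struct, the `arrival_order` function, and the 5ms gap between the two hooks firing are assumptions made for the example. Only the ~19ms and ~37ms preparation times come from the measurements above.

```rust
// Toy model of the send-order race; not git-ai's actual code.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Human, // checkpoint produced by PreToolUse
    Ai,    // checkpoint produced by PostToolUse
}

#[derive(Debug, Clone, Copy)]
struct Hook {
    kind: Kind,
    fired_at_ms: u64, // when the hook process started (hypothetical values)
    prep_ms: u64,     // time spent before the checkpoint is sent to the daemon
}

/// Returns the checkpoint kinds in the order they would reach a FIFO queue,
/// assuming arrival time = fire time + preparation time.
fn arrival_order(mut hooks: Vec<Hook>) -> Vec<Kind> {
    hooks.sort_by_key(|h| h.fired_at_ms + h.prep_ms);
    hooks.into_iter().map(|h| h.kind).collect()
}

fn main() {
    // Subagent A's PostToolUse fires first but parses the transcript (~37ms).
    let post = Hook { kind: Kind::Ai, fired_at_ms: 0, prep_ms: 37 };
    // Subagent B's PreToolUse fires 5ms later (assumed gap) and returns early (~19ms).
    let pre = Hook { kind: Kind::Human, fired_at_ms: 5, prep_ms: 19 };

    let order = arrival_order(vec![post, pre]);
    println!("{:?}", order); // prints [Human, Ai]
}
```

Under this model the Human checkpoint wins whenever PreToolUse fires less than 18ms (37 - 19) after PostToolUse; with a larger gap the AI checkpoint arrives first and the problem does not appear.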

Exact Sequence

Last AI checkpoint (#N): file state before parallel edits
Subagent A writes 12 lines to file
Subagent A PostToolUse fires  → git-ai captures blobs (~37ms, parsing transcript)
Subagent B PreToolUse fires   → git-ai captures blobs (~19ms, early return)
                                ↑ PreToolUse sends to daemon FIRST

Daemon processes PreToolUse (Human):
  diff(file, checkpoint #N) = +12 lines → attributed as Human
  → Human checkpoint #N+1 written with Subagent A's lines as "human"

Daemon processes PostToolUse (AI):
  diff(file, checkpoint #N+1) = no change (Human already consumed state)
  → AI checkpoint is empty → NOT written

Commit: lines are absent from git note → human by omission
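
The sequence above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the daemon's real data structures: it tracks only line counts, the `attribute` function is hypothetical, and the baseline of 602 lines is an illustrative value. Each checkpoint claims whatever was added since the previous checkpoint, which is the behaviour described in the sequence.

```rust
// Toy model of checkpoint attribution; not git-ai's actual data structures.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Human,
    Ai,
}

/// Processes queued checkpoints in FIFO order. Each entry is (kind, number of
/// lines in the file when captured). A checkpoint claims the lines added since
/// the previous checkpoint. Returns (human_lines, ai_lines).
fn attribute(baseline_lines: usize, queue: &[(Kind, usize)]) -> (usize, usize) {
    let mut last_seen = baseline_lines;
    let (mut human, mut ai) = (0, 0);
    for &(kind, file_lines) in queue {
        let added = file_lines.saturating_sub(last_seen);
        match kind {
            Kind::Human => human += added,
            Kind::Ai => ai += added,
        }
        // The next checkpoint diffs against this one, whoever claimed the lines.
        last_seen = file_lines;
    }
    (human, ai)
}

fn main() {
    let baseline = 602; // hypothetical file length at checkpoint #N
    let after = baseline + 12; // Subagent A appended 12 lines

    // Raced order: Human PreToolUse reaches the daemon first.
    let raced = attribute(baseline, &[(Kind::Human, after), (Kind::Ai, after)]);
    // Intended order: AI PostToolUse is processed first.
    let intended = attribute(baseline, &[(Kind::Ai, after), (Kind::Human, after)]);

    println!("raced: {:?}", raced); // prints raced: (12, 0)
    println!("intended: {:?}", intended); // prints intended: (0, 12)
}
```

Both checkpoints see the same file state, so the only thing that decides attribution in this model is which one is processed first.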

Reproduction

#!/usr/bin/env bash
# Requires: git-ai installed, async_mode=true in ~/.git-ai/config.json
# Pass any valid Claude transcript as argument

TRANSCRIPT="${1:-$(ls ~/.claude/projects/*/*.jsonl 2>/dev/null | head -1)}"
REPO=$(mktemp -d /tmp/git-ai-race-XXXX)
cd "$REPO"
git init -q && git config user.name "Test User" && git config user.email "test@test.com"

cat > target.py << 'PYEOF'
def function_one():
    return [i * 2 for i in range(10)]

def function_two():
    return {"key1": "value1"}
PYEOF
git add target.py && git commit -q -m "initial"

# Send a hook payload to git-ai: $1 = hook event name, $2 = file path (defaults to target.py)
checkpoint() {
    echo "{\"hook_event_name\":\"$1\",\"tool_name\":\"Write\",\"tool_input\":{\"file_path\":\"${2:-target.py}\"},\"transcript_path\":\"$TRANSCRIPT\",\"cwd\":\"$REPO\"}" \
      | git-ai checkpoint claude --hook-input stdin 2>/dev/null
}

# Step 1: Establish prior AI attribution
cat >> target.py << 'PYEOF'


def function_three():
    """Written by AI - prior edit."""
    return [x * 3 for x in range(10)]
PYEOF
checkpoint "PostToolUse" "target.py"

# Step 2: Subagent A writes new AI content
cat >> target.py << 'PYEOF'


def function_four():
    """Written by Subagent A — will be misattributed."""
    return [x**2 for x in range(1, 6)]

def function_five():
    """Written by Subagent A — will be misattributed."""
    return {'a': 1, 'b': 2}
PYEOF

# Step 3: THE RACE — Human PreToolUse starts first (and wins the daemon queue)
checkpoint "PreToolUse" "target.py" &
sleep 0.05  # give PreToolUse a head start (50ms here, exaggerating the ~18ms it gains while PostToolUse parses the transcript)
checkpoint "PostToolUse" "target.py" &
wait

sleep 3  # wait for daemon

# Step 4: Check attribution
git add target.py && git commit -q -m "feat: parallel subagent edits"
echo "=== Git note (only lines listed here are AI-attributed) ==="
git notes --ref=refs/notes/ai show HEAD | head -3
echo ""
echo "Expected: lines 9-18 (function_three through function_five) listed as AI"
echo "Actual:   only lines 9-14 (function_three) — function_four/five are MISSING = human"
git-ai stats HEAD 2>/dev/null | grep -v BENCHMARK | head -3

Observed output:

=== Git note ===
target.py
  36ee87f956a9e26f 9-14     ← only function_three, lines 9-14

Expected: lines 9-18 listed
Actual:   lines 15-18 absent = human

you  ████████████████████████████████████████ ai
     100%                                   0%

Affected Code

  • src/commands/checkpoint_agent/agent_presets.rs: PreToolUse always returns CheckpointKind::Human
  • src/commands/git_ai_handlers.rs: run_checkpoint_via_daemon_or_local / captured checkpoint flow
  • The daemon's FIFO queue means capture ordering (which process captures first) determines attribution

Impact

Any session using the Agent tool to launch ≥2 parallel subagents editing the same file is vulnerable. The misattribution is permanent — no subsequent AI checkpoint can reclaim the lines because the Human checkpoint consumed the file state first.
