Summary
During rhiza sync (3-way merge), a diverged target ends up with both git conflict markers (unmerged index stages) and duplicate *.rej files for the same hunks. The .rej files are spurious — the hunks are already represented as conflict markers / index entries — but they make downstream resolution fragile and noisy.
Observed in rhiza-hooks syncing v0.18.8 → v0.19.3: 14 files left with <<<<<<< / >>>>>>> markers (and git ls-files -u index stages) and 33 *.rej files, all of which were verified spurious (already applied, or byte-identical to the upstream bundle source).
Root cause
rhiza/models/_git_utils.py::_apply_diff (v0.17.6), lines ~356–395:
    try:
        subprocess.run([git, "apply", "-3"], input=diff, check=True, ...)
    except CalledProcessError as e:
        stderr = ...
        if "lacks the necessary blob" in stderr and base_snapshot and upstream_snapshot:
            return self._merge_file_fallback(...)  # markers only, OK
        if stderr:
            logger.warning(...)
        # Fall back to --reject for conflict files
        subprocess.run([git, "apply", "--reject"], input=diff, check=True, ...)  # <-- re-applies the WHOLE diff
        return False
When git apply -3 does have the blobs, it performs a real 3-way merge: it applies what it can, writes conflict markers, updates the index with unmerged stages, and exits non-zero. Because the stderr is not the "lacks the necessary blob" case, control falls through to git apply --reject — which re-applies the entire same diff from scratch. For hunks git apply -3 already turned into conflict markers, --reject now also drops a .rej. Hence markers and .rej for the same hunks.
The _merge_file_fallback branch is fine (markers only). The bug is the --reject re-run after a partial -3 merge.
Expected
A given hunk should be represented once — either as a conflict marker (preferred, since -3 already produced a resolvable index state) or as a .rej, never both.
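One way to check this invariant after a sync is to cross-reference the unmerged index entries with the .rej files on disk. The sketch below is illustrative only: the helper names unmerged_paths and find_double_reported are hypothetical and not part of rhiza. It assumes that git apply --reject writes rejects to <path>.rej next to the target file, and that git ls-files -u was run without -z (so unusual paths may appear quoted).

```python
from __future__ import annotations


def unmerged_paths(ls_files_u_output: str) -> set[str]:
    """Extract paths from `git ls-files -u` output.

    Each line has the form "<mode> <object> <stage>" followed by a TAB and
    the path. A conflicted path appears once per stage, so a set is used
    to de-duplicate.
    """
    paths: set[str] = set()
    for line in ls_files_u_output.splitlines():
        if "\t" in line:
            paths.add(line.split("\t", 1)[1])
    return paths


def find_double_reported(unmerged: set[str], rej_files: list[str]) -> list[str]:
    """Return paths that have both unmerged index stages and a <path>.rej file."""
    suffix = ".rej"
    rejected = {r[: -len(suffix)] for r in rej_files if r.endswith(suffix)}
    return sorted(unmerged & rejected)
```

An empty result means no file is reported twice. A path-level match is only a hint: it does not prove that the rejected hunk and the conflict marker cover the same lines.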
Suggested fix
When git apply -3 partially applied (non-zero exit, not the missing-blob case), don't re-run --reject over the full diff. Options:
- Trust the -3 result: the markers/index ARE the conflict representation; just return False and let the caller report markers.
- Or, if --reject is still wanted for genuinely-unapplied files, first git checkout/reset the paths -3 already touched so the two strategies don't overlap.
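The first option could look roughly like the sketch below. It is not the actual _apply_diff: the function name apply_diff_once and the injectable run parameter are hypothetical, and the missing-blob fallback is omitted for brevity. It assumes that a failed git apply -3 which leaves no unmerged index entries did not modify the working tree, and that the index had no unmerged entries before the call.

```python
import subprocess
from typing import Callable


def apply_diff_once(
    diff: str,
    cwd: str,
    git: str = "git",
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Apply `diff` so that each hunk is reported at most once.

    Returns True on a clean apply, False when conflicts or rejects remain.
    """
    opts = {"cwd": cwd, "capture_output": True, "text": True}
    try:
        run([git, "apply", "-3"], input=diff, check=True, **opts)
        return True
    except subprocess.CalledProcessError:
        pass  # non-zero exit: either conflicts were recorded or nothing applied

    # Ask the index whether -3 recorded any conflicts.
    unmerged = run([git, "ls-files", "-u"], check=True, **opts).stdout
    if unmerged.strip():
        # Markers and index stages already represent the conflicts;
        # do not re-apply the diff with --reject.
        return False

    # No conflicts were recorded, so fall back to --reject.
    # check=False because --reject exits non-zero when hunks are rejected.
    run([git, "apply", "--reject"], input=diff, check=False, **opts)
    return False
```

Passing run as a parameter is only there to make the control flow testable without a real repository; in rhiza itself the existing subprocess call would presumably stay as is.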
Secondary observation (possibly separate issue)
_clean_orphaned_files did not remove a stale managed file (.rhiza/tests/integration/test_workflow_stubs.py) that was dropped from the manifest in an earlier release — the sync logged "No orphaned files to clean up". It appears orphan detection only considers files present in the immediately previous lock; a file that fell out of the manifest in an earlier sync and was never cleaned becomes a permanent orphan. Worth confirming the comparison set.
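To make the suspected gap concrete, the sketch below contrasts the two comparison sets. Both function names are hypothetical, and the sketch assumes a history of earlier lock contents is available, which rhiza may not actually retain; the real logic in _clean_orphaned_files has not been confirmed.

```python
from __future__ import annotations


def orphans_from_previous_lock(previous_lock: set[str], manifest: set[str]) -> set[str]:
    """Suspected behaviour: only files in the immediately previous lock are candidates."""
    return previous_lock - manifest


def orphans_from_history(
    lock_history: list[set[str]], manifest: set[str], on_disk: set[str]
) -> set[str]:
    """Wider comparison set: ever managed, still on disk, no longer in the manifest."""
    ever_managed: set[str] = set().union(*lock_history)
    return (ever_managed & on_disk) - manifest
```

With a file that was dropped two syncs ago and never deleted, the first function returns an empty set while the second still reports it, which matches the "No orphaned files to clean up" log described above.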
Repro
Sync any sufficiently diverged downstream repo from an older ref: to a newer one, then look for *.rej files covering hunks that the index merge already turned into conflict markers.
Filed from downstream tracking issue Jebel-Quant/rhiza-hooks#200.