Description
Update file comparison logic in attribution calculation and checkpoint content detection to respect git's line-ending normalization settings (core.autocrlf, clean/smudge filters). Currently, raw byte comparison between working tree files and committed blobs can produce false positives when git normalizes line endings on commit.
The fix should:
- Use git diff --quiet <path> as the primary check for whether a file has meaningful changes, before falling back to raw blob hash comparison
- Ensure attribution calculations don't inflate agent percentages due to CRLF/LF differences
- Add a regression test that enables core.autocrlf=true and verifies line-ending-only changes don't affect attribution
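The check order described above could look roughly like the following Python sketch. The names git_blob_sha1, run_git, and has_meaningful_changes are hypothetical and not taken from the Partio codebase; the sketch also assumes a SHA-1 repository and that any exit code other than 0 or 1 from git diff means git could not answer.

```python
import hashlib
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def git_blob_sha1(data: bytes) -> str:
    """Hash bytes the way git hashes a blob: sha1(b"blob <size>\\0" + data)."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def run_git(args: Sequence[str], cwd: str) -> int:
    """Run git with the given arguments and return its exit code."""
    return subprocess.run(["git", *args], cwd=cwd, capture_output=True).returncode


def has_meaningful_changes(
    repo: str,
    path: str,
    committed_sha: str,
    run: Callable[[Sequence[str], str], int] = run_git,
) -> bool:
    """Return True if `path` differs from what git has recorded.

    Primary check: `git diff --quiet -- <path>`, which applies clean filters
    and core.autocrlf before comparing. Note that this form compares the
    working tree against the index; `git diff --quiet HEAD -- <path>` would
    compare against the last commit instead.
    """
    code = run(["diff", "--quiet", "--", path], repo)
    if code == 0:
        return False  # git sees no difference after normalization
    if code == 1:
        return True  # git sees a real content difference
    # Assumption: any other exit code means git failed (for example, not a
    # repository), so fall back to comparing the raw on-disk blob hash.
    raw = (Path(repo) / path).read_bytes()
    return git_blob_sha1(raw) != committed_sha
```

The run parameter exists only so the decision logic can be exercised without a real repository; a caller would normally leave it at its default.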
Why
On Windows or in cross-platform teams where core.autocrlf=true is common, git normalizes line endings on commit (CRLF in working tree → LF in repository). If Partio compares raw on-disk bytes to committed blob hashes, every file touched by the agent could appear "modified" even when the only difference is line endings. This leads to inflated attribution percentages and potentially spurious checkpoint content.
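To make the inflation concrete, here is a small illustrative sketch (not Partio's actual attribution code) that counts changed lines between a committed LF version and a CRLF working copy, with and without line-ending normalization. The function name changed_line_count is hypothetical, and the blanket CRLF-to-LF replacement is a simplification: git itself only converts files it treats as text.

```python
import difflib


def changed_line_count(committed: bytes, working: bytes, normalize_eol: bool) -> int:
    """Count lines that differ between the committed and working versions."""
    if normalize_eol:
        # Simplified stand-in for git's CRLF -> LF conversion on commit.
        working = working.replace(b"\r\n", b"\n")
    old = committed.decode().splitlines(keepends=True)
    new = working.decode().splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += max(i2 - i1, j2 - j1)
    return changed


committed = b"a\nb\nc\n"       # what the repository stores (LF)
working = b"a\r\nb\r\nc\r\n"   # what sits on disk with core.autocrlf=true (CRLF)

raw_changed = changed_line_count(committed, working, normalize_eol=False)
normalized_changed = changed_line_count(committed, working, normalize_eol=True)
```

In this example the raw comparison reports all three lines as changed, while the normalized comparison reports none, which is the gap between the inflated and the intended attribution.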
Acceptance criteria
- Attribution diff comparison respects git's clean/smudge filters and core.autocrlf setting
- Files with only line-ending differences (CRLF vs LF) are not counted as agent-modified
- Checkpoint content comparison uses git diff --quiet as the primary cleanliness check before falling back to raw blob hashing
- Regression test verifies that core.autocrlf=true does not inflate attribution percentages
- Windows and cross-platform CI environments produce consistent attribution results
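A regression check along these lines might be sketched as follows. This is a standalone illustration in Python rather than the project's real test harness; the function name autocrlf_regression_check and the file name f.txt are hypothetical, and it assumes git is on PATH and the repository uses the default SHA-1 object format.

```python
import hashlib
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> subprocess.CompletedProcess:
    """Run git in `repo` with a throwaway identity and predictable settings."""
    return subprocess.run(
        ["git",
         "-c", "user.name=test", "-c", "user.email=test@example.com",
         "-c", "commit.gpgsign=false", "-c", "core.safecrlf=false",
         *args],
        cwd=repo, capture_output=True, text=True,
    )


def autocrlf_regression_check():
    """Commit a CRLF file under core.autocrlf=true and compare both checks.

    Returns None when git is not installed, so callers can skip the check.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "--quiet")
        git(repo, "config", "core.autocrlf", "true")

        crlf = b"one\r\ntwo\r\n"
        (repo / "f.txt").write_bytes(crlf)
        git(repo, "add", "f.txt")  # git stores the blob with LF endings
        git(repo, "commit", "--quiet", "-m", "add f.txt")

        committed_sha = git(repo, "rev-parse", "HEAD:f.txt").stdout.strip()
        # Raw blob hash of the on-disk bytes, with no normalization applied.
        raw_sha = hashlib.sha1(b"blob %d\x00" % len(crlf) + crlf).hexdigest()
        git_clean = git(repo, "diff", "--quiet", "--", "f.txt").returncode == 0

        return {"raw_hash_equal": raw_sha == committed_sha, "git_clean": git_clean}
```

The expectation is that git reports the file as clean while the raw hash comparison reports a mismatch; a test for the fix would assert that attribution follows the former.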
Source
Inspired by entireio/cli PR #913