Skip to content

fix(node): preserve promisor mirror mode on unknown withheld-paths lookup (#48)#69

Open
beardthelion wants to merge 1 commit into
mainfrom
fix/sync-preserve-promisor-on-unknown-withheld
Open

fix(node): preserve promisor mirror mode on unknown withheld-paths lookup (#48)#69
beardthelion wants to merge 1 commit into
mainfrom
fix/sync-preserve-promisor-on-unknown-withheld

Conversation

@beardthelion

@beardthelion beardthelion commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator

Summary

Peer sync downgraded a still-withheld promisor mirror to a full clone on a transient withheld-paths outage. This makes the mirror-mode decision distinguish "unknown lookup" from "genuinely public" and preserve the existing promisor mode when the lookup is unknown.

Motivation & context

Closes #48

process_batch decided how to mirror a repo from the origin's anonymous withheld-paths answer. fetch_withheld returns Some(globs) on a 2xx and None on every failure (404, 5xx, network, parse). classify_mirror then mapped Some(non-empty) => Promisor but _ => Plain, collapsing both Some(empty) (genuinely public) and None (unknown) into Plain. For an existing promisor mirror that ran fetch_repo(.., Plain), which unset remote.origin.promisor + remote.origin.partialclonefilter and ran git fetch --refetch. So a transient outage of the withheld-paths endpoint against a still-withheld repo tore down the partial-clone config and forced a full refetch against an origin that still withholds blobs for anonymous callers, breaking syncs until the endpoint recovered. Sibling of #47, same fail-open-vs-fail-closed family.

Kind of change

  • Bug fix
  • Feature
  • Security fix
  • Docs
  • Tests / CI
  • Refactor (no behavior change)
  • Breaking or protocol change (issue required first)

What changed

gitlawb-node (crates/gitlawb-node/src/sync.rs):

  • Replace classify_mirror(withheld) with resolve_mirror_mode(withheld, exists, promisor). Some(non-empty) => Promisor and Some(empty) => Plain are unchanged. On None, an existing mirror preserves its promisor mode instead of downgrading; a fresh clone stays Plain (the existing fail-closed-at-the-git-layer behavior). A genuine public transition always returns Some(empty), never None, so preserving on None cannot mask one; it only suppresses the transient-error false positive.
  • Add a three-valued PromisorProbe (Promisor / NotPromisor / Unknown) read from git config --get remote.origin.promisor exit codes. git config --get collapses "key absent" (exit 1) and "git errored" (exit ≥2 / spawn failure) into the same non-zero exit otherwise; distinguishing them means a transient probe failure (Unknown) is not read as a definitive NotPromisor and cannot itself trigger the downgrade.
  • Probe only when the lookup is unknown and the mirror already exists (the sole case the answer changes the mode). One warn! marks the preserve path, derived from the resolved mode so it cannot drift from the decision.

A persistently unreachable withheld-paths route pins an existing promisor mirror until the endpoint recovers (the per-cycle warn is the operator-visible signal). This is an intentional confidentiality-over-liveness bias: a still-withheld repo's anonymous git read endpoint 404s, so a preserved mirror fetches nothing harmful, and only a public-but-route-permanently-dropped repo loses backfill liveness.

How a reviewer can verify

cargo test -p gitlawb-node sync
cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings

The core regression is resolve_preserves_promisor_on_unknown_lookup_for_existing_mirror: None + existing + on-disk promisor must stay Promisor, never downgrade. resolve_preserves_promisor_on_indeterminate_probe covers the probe-failure (Unknown) case. probe_reports_promisor_for_promisor_mirror / probe_reports_not_promisor_when_key_absent / probe_reports_unknown_on_git_error exercise the three probe outcomes against real git repos. plain_fetch_clears_promisor_config_on_transition is unchanged and still passes, proving a genuine Some(empty) public transition still clears promisor config.

Before you request review

  • Scope is one logical change; no unrelated churn
  • cargo test --workspace passes locally
  • New behavior is covered by tests (required for fixes)
  • cargo fmt --all and cargo clippy --workspace --all-targets -- -D warnings are clean
  • Commit titles use Conventional Commits (feat(...), fix(...), docs(...))
  • Docs / .env.example updated if behavior or config changed (or N/A)
  • Checked existing PRs so this isn't a duplicate

Notes for reviewers

Out of scope (follow-ups): typed error reasons in fetch_withheld (404 vs 5xx vs parse) and retry/backoff on a failed lookup; both deferred deliberately since the empty-vs-unknown distinction the fix needs already survives in the Option return. The withheld-paths endpoint is unauthenticated (pre-existing); node-level auth for it is worth a separate look.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed unsafe mirror mode downgrade behavior when origin lookup fails, preventing potential data synchronization issues (resolves issue #48).
    • Improved robustness of mirror mode detection with better handling of uncertain lookup states.
  • Improvements

    • Enhanced existing mirror detection logic to preserve promisor mode when lookups are inconclusive, ensuring safer fallback behavior.

…okup (#48)

Peer sync decided mirror mode from the origin's withheld-paths answer via
classify_mirror, whose _ => Plain arm collapsed both Some(empty) (genuinely
public) and None (lookup 404'd / 5xx / network / parse error) into Plain. For an
existing promisor mirror that meant fetch_repo(.., Plain) unset
remote.origin.promisor + partialclonefilter and ran git fetch --refetch, so a
transient withheld-paths outage downgraded a still-withheld mirror to a full
clone and broke syncs until the endpoint recovered.

Replace classify_mirror with resolve_mirror_mode(withheld, exists, promisor):
only Some(empty) downgrades an existing mirror; None preserves the on-disk
promisor mode. A genuine public transition always returns Some(empty), never
None, so preserving on None cannot mask one; it only suppresses the
transient-error false positive.

Add a three-valued PromisorProbe read from git config --get exit codes so a
transient probe failure (Unknown) is not mistaken for a definitive NotPromisor
and cannot itself trigger the downgrade. One warn marks the preserve path,
derived from the resolved mode so it cannot drift.

Tests cover every resolve_mirror_mode arm (incl. the regression and the
indeterminate-probe case) and the probe's Promisor / NotPromisor / Unknown
outcomes against real git repos.
@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

sync.rs replaces the two-valued classify_mirror function with a three-valued PromisorProbe enum, a new existing_promisor_state probe (reading remote.origin.promisor via git config), and a resolve_mirror_mode function. process_batch is updated to invoke the probe conditionally and derive MirrorMode through resolve_mirror_mode, emitting a warning when promisor mode is preserved due to an unknown withheld-paths lookup. Unit and async tests are added to cover all resolution branches.

Changes

Three-valued mirror-mode resolution

Layer / File(s) Summary
PromisorProbe enum and resolve_mirror_mode function
crates/gitlawb-node/src/sync.rs
Adds PromisorProbe (Promisor, NotPromisor, Unknown) and resolve_mirror_mode(withheld, exists, promisor) replacing classify_mirror; preserves MirrorMode::Promisor when withheld-paths lookup is unknown and the mirror exists, and updates the fetch_withheld doc comment to reference the new resolver.
existing_promisor_state on-disk probe
crates/gitlawb-node/src/sync.rs
Adds existing_promisor_state(repo) -> PromisorProbe that shells out to git config --get remote.origin.promisor and maps exit codes to Promisor, NotPromisor, or Unknown.
process_batch wiring
crates/gitlawb-node/src/sync.rs
Updates process_batch to compute exists/lookup_unknown, conditionally invoke existing_promisor_state only when needed, resolve MirrorMode via resolve_mirror_mode, and warn when promisor mode is preserved due to an unknown lookup.
Unit and async tests
crates/gitlawb-node/src/sync.rs
Replaces classify_mirror tests with resolve_mirror_mode coverage across all branch combinations; adds async tests for existing_promisor_state outputs (Promisor, NotPromisor, Unknown).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Gitlawb/node#35: Introduces the original process_batch mirror-mode derivation and classify_mirror logic in sync.rs that this PR directly replaces.

Poem

🐇 Hoppity-hop through the mirror hall,
Three states now guard each promisor call —
Unknown withheld? We shall not downgrade!
The on-disk config comes to our aid.
No transient outage shall flatten a clone,
This rabbit ensures promisor keeps its throne! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(node): preserve promisor mirror mode on unknown withheld-paths lookup (#48)' accurately reflects the main change—preventing downgrades of promisor mirrors on transient endpoint failures.
Description check ✅ Passed The description follows the template structure with all major sections completed: Summary, Motivation & context, Kind of change, What changed, How to verify, pre-review checklist, and Notes for reviewers.
Linked Issues check ✅ Passed The PR fully addresses issue #48: introduces resolve_mirror_mode to distinguish unknown lookups from public confirmations, adds PromisorProbe to detect transient probe failures, and includes comprehensive tests validating all paths.
Out of Scope Changes check ✅ Passed All changes are scoped to the sync.rs file and directly address the regression in issue #48; the PR intentionally defers typed error reasons and retry/backoff as follow-ups.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/sync-preserve-promisor-on-unknown-withheld

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/gitlawb-node/src/sync.rs (1)

508-510: ⚡ Quick win

Minor documentation inaccuracy: exit code 1 does not mean "not a git repo".

The comment states exit 1 occurs when the key is absent "or the directory is not a git repo." However, git config --get exits 1 specifically when the key is absent in a valid git repository. For an invalid/non-existent repo, git exits 128 (which correctly maps to Unknown at line 531). The test at line 854 confirms this behavior.

📝 Suggested doc fix
-/// `git config --get` exits 0 when the key is set, 1 when it is absent (or the
-/// directory is not a git repo) — both definitive `NotPromisor` — and other
+/// `git config --get` exits 0 when the key is set, 1 when it is absent in a valid
+/// git repo — a definitive `NotPromisor` — and other
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/sync.rs` around lines 508 - 510, The documentation
comment for git config exit codes contains an inaccuracy. Update the comment
around the git config validation logic to clarify that exit code 1 specifically
indicates the key is absent in a valid git repository, not that the directory is
not a git repo. Remove the "or the directory is not a git repo" clause from the
exit code 1 description, since exit code 128 (not 1) is what git returns for
invalid or non-existent repositories. This aligns the documentation with the
actual behavior confirmed by the test at line 854.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/gitlawb-node/src/sync.rs`:
- Around line 508-510: The documentation comment for git config exit codes
contains an inaccuracy. Update the comment around the git config validation
logic to clarify that exit code 1 specifically indicates the key is absent in a
valid git repository, not that the directory is not a git repo. Remove the "or
the directory is not a git repo" clause from the exit code 1 description, since
exit code 128 (not 1) is what git returns for invalid or non-existent
repositories. This aligns the documentation with the actual behavior confirmed
by the test at line 854.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: de5c0390-a193-4b91-81bf-99bb816e21f3

📥 Commits

Reviewing files that changed from the base of the PR and between e37ea7f and 68465f8.

📒 Files selected for processing (1)
  • crates/gitlawb-node/src/sync.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Peer sync downgrades still-withheld mirrors to Plain on a transient withheld-paths error

1 participant