Skip to content

fix(node): skip withheld-walk when no path-scoped rule can withhold#60

Open
beardthelion wants to merge 2 commits into
mainfrom
fix/withheld-walk-short-circuit
Open

fix(node): skip withheld-walk when no path-scoped rule can withhold#60
beardthelion wants to merge 2 commits into
mainfrom
fix/withheld-walk-short-circuit

Conversation

@beardthelion

@beardthelion beardthelion commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

Public clones and fetches were paying for an authorization walk they could never benefit from. On every clone/fetch the serve path enumerated every blob path across every ref (a blocking git ls-tree -r per ref) to compute which blobs to withhold, then threw the result away and served a normal pack whenever nothing was withheld — which is always the case for a repo with no path-scoped visibility rules, i.e. the common public repo. The cost got worse with #42, which widened the walk from heads/tags to every ref.

This skips the walk entirely when no rule can withhold anything.

What changed

A shared predicate, has_path_scoped_rule, now gates both consumers of the walk. The serve path (git_upload_pack) skips the blocking walk and serves the pack directly; the post-push replication path's existing empty-rules short-circuit is generalized to also cover root-only rule sets, so the two call sites can't drift.

Why it's safe

The short-circuit is only sound because the whole-repo / access gate already runs before both call sites: a denying root rule 404s the caller on the serve path, and suppresses announcement on the replication path, before the predicate is ever consulted. Any caller that reaches the predicate has therefore already cleared repo-level access — and with no path-scoped rule left, there is nothing to withhold. Served-pack contents are unchanged on the success path.

The predicate's doc comment records this precondition explicitly, plus the dependency on validate_path_glob rejecting /** (the only non-/ glob that would match every path).

One intended behavioural difference, on the error path only: a corrupt public/root-only repo whose ref walk would have thrown now skips that walk and goes straight to upload_pack, so a spurious 500 originating in the walk no longer surfaces there. This is the wasted-walk overhead going away, not a change to what content is served.

Tests

  • Predicate unit tests (empty / single-root / path-scoped / mixed / multi-root).
  • A gate-aware safety invariant test: for any caller that has passed the / gate, an empty or root-only rule set yields an empty withheld set — pinning the invariant the skip depends on against future changes to the matching logic.
  • Full gitlawb-node suite: 117 passing; cargo fmt --check and cargo clippy clean.

The serve/replication HTTP endpoints have no integration-test harness in this crate, so the call-site wiring is covered at the predicate level rather than via new end-to-end tests.

Closes #50.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed visibility rule enforcement logic for repository replication handling in certain configurations.
  • Performance Improvements

    • Optimized git pack operations by skipping unnecessary processing when repository visibility configuration contains only whole-repository rules.
  • Tests

    • Extended test coverage for visibility rule validation, including edge cases and safety checks.

…t-circuit

Shared predicate (rules.iter().any(path_glob != "/")) both serve and
replication paths will use to skip the per-blob withheld walk when no rule
can withhold. Doc records the "/"-gate precondition and the validate_path_glob
dependency. Adds unit tests plus a gate-aware safety invariant test.
)

The serve path (git_upload_pack) ran the full per-ref withheld_blob_oids
walk on every clone/fetch before falling back to a plain upload_pack, even
for repos with no path-scoped rules where the walk can never withhold
anything. Cost grew with #42 (per-ref git ls-tree -r).

Guard both call sites on has_path_scoped_rule: the serve path skips the
spawn_blocking walk entirely and serves upload_pack directly; the
replication path generalizes its is_empty() short-circuit to also cover
root-only rule sets. Safe because the whole-repo "/" gate runs before both
sites, so a denying root rule 404s/withholds upstream and never reaches the
predicate. No change to served-pack contents on the success path.
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a new has_path_scoped_rule predicate to visibility_pack that returns true when any visibility rule targets a sub-path (i.e., path_glob != "/"). This predicate is then used in git_upload_pack to skip the spawn_blocking per-blob withheld-OID walk entirely for root-only rule sets, and in replication enforcement to replace an is_empty() check with the equivalent predicate check.

Changes

Short-circuit withheld-blob walk for non-path-scoped rules

Layer / File(s) Summary
has_path_scoped_rule predicate and tests
crates/gitlawb-node/src/git/visibility_pack.rs
Introduces pub fn has_path_scoped_rule(rules: &[VisibilityRule]) -> bool that returns true only when at least one rule has path_glob != "/". Adds unit tests for empty, root-only, mixed, and duplicate-root inputs, plus a safety-invariant test confirming withheld_blob_oids returns empty when the predicate is false.
Upload-pack and replication enforcement wiring
crates/gitlawb-node/src/api/repos.rs
In git_upload_pack, adds a fast path that calls smart_http::upload_pack directly when has_path_scoped_rule is false, bypassing spawn_blocking. In replication enforcement, the "withhold nothing" match arm is now triggered by !has_path_scoped_rule(rules) instead of rules.is_empty().

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Gitlawb/node#25: Adds the visibility-rule authorization flow in api/repos.rs that this PR's predicate and short-circuit directly build upon.
  • Gitlawb/node#28: Introduced the original withheld_blob_oids walk in git_upload_pack that this PR short-circuits with has_path_scoped_rule.
  • Gitlawb/node#34: Modified the same git_upload_pack/git_receive_pack withheld-blob logic that this PR further refines for non-path-scoped rule sets.

Suggested reviewers

  • kevincodex1

Poem

🐇 No path-scoped rule? Then why should I toil,
Walking each blob through mud and through soil?
A quick little check — has_path_scoped_rule
And off goes the pack, light, snappy, and cool.
This bunny hops fast when the shortcut is true! 🚀

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: introducing an optimization to skip the withheld-walk when no path-scoped rules exist, which is the core performance improvement in this PR.
Linked Issues check ✅ Passed The PR fully implements the objectives from issue #50: it introduces the has_path_scoped_rule predicate and applies it to both the serve path and replication path to skip unnecessary withheld-blob walks.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the optimization objective: the has_path_scoped_rule predicate addition and its application to both call sites in git_upload_pack and git_receive_pack logic.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/withheld-walk-short-circuit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Serve path: short-circuit withheld walk when no path-scoped rule can withhold

1 participant