fix(node): dedupe mirror and canonical repo rows on list surfaces (#6) #73

Open
beardthelion wants to merge 1 commit into main from fix/dedup-mirror-rows-canonical-owner

@beardthelion

Collaborator

Summary

Profile and repo-list surfaces rendered the same logical repo twice when a short-owner peer mirror row and the canonical did:key: row both existed. This collapses them to one card on the surfaces that were missing the dedup.

Motivation & context

Closes #6

node.gitlawb.com showed two nipmod cards on a profile: one from the peer mirror row (owner_did = "z6Mk…", description "mirrored from peer") and one from the canonical row (owner_did = "did:key:z6Mk…"). The paged repo list already deduped these in SQL, but the non-paged GET /api/v1/repos legacy path and list_federated_repos returned every matching row, so both showed up.

Kind of change

  • Bug fix
  • Feature
  • Security fix
  • Docs
  • Tests / CI
  • Refactor (no behavior change)
  • Breaking or protocol change (issue required first)

What changed

Crate touched: gitlawb-node.

  • Added dedupe_canonical_repos in api/repos.rs. It groups rows by (normalized owner, name), where the normalized owner is the key segment after the last : (so did:key:z6Mk… and the bare z6Mk… mirror row collapse together). Within each group it keeps the canonical row (a non-mirror row beats a "mirrored from peer" row, with ties broken by earliest created_at) and carries the group's most recent updated_at onto the survivor, so a gossip push that only touched the mirror row still floats the repo to the top. This matches the existing SQL dedup in Db::list_all_repos_paged.
  • Applied it at the two non-paged surfaces: the legacy list_repos fallback and list_federated_repos. As a side effect, the legacy path's X-Total-Count now counts logical repos rather than raw rows, consistent with the paged path.
  • Added a repos::tests module covering canonical-wins, distinct-repos-preserved, same-owner-different-repo, and the mirror tie-break.
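
For readers without the diff open, the grouping and preference rules described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the RepoRow struct and its fields, the is_mirror and normalized_owner helpers, the i64 timestamps, and detecting a mirror by its "mirrored from peer" description are all hypothetical stand-ins. Only the name dedupe_canonical_repos and the rules themselves come from the description.

```rust
use std::collections::hash_map::{Entry, HashMap};

/// Hypothetical row shape; the real struct in gitlawb-node may differ.
#[derive(Debug, Clone, PartialEq)]
struct RepoRow {
    owner_did: String,
    name: String,
    description: String,
    created_at: i64, // assumed: unix seconds
    updated_at: i64, // assumed: unix seconds
}

/// Key segment after the last ':' so "did:key:z6Mk..." and a bare
/// "z6Mk..." owner normalize to the same value.
fn normalized_owner(owner_did: &str) -> &str {
    owner_did.rsplit(':').next().unwrap_or(owner_did)
}

/// Assumption: a mirror row is recognized by its placeholder description.
fn is_mirror(row: &RepoRow) -> bool {
    row.description == "mirrored from peer"
}

/// Collapse rows that share (normalized owner, name) into one survivor.
/// Output keeps first-seen group order (an assumption of this sketch).
fn dedupe_canonical_repos(rows: Vec<RepoRow>) -> Vec<RepoRow> {
    let mut order: Vec<(String, String)> = Vec::new();
    let mut groups: HashMap<(String, String), RepoRow> = HashMap::new();

    for row in rows {
        let key = (normalized_owner(&row.owner_did).to_string(), row.name.clone());
        match groups.entry(key) {
            Entry::Vacant(slot) => {
                order.push(slot.key().clone());
                slot.insert(row);
            }
            Entry::Occupied(mut slot) => {
                let kept = slot.get_mut();
                // The survivor always carries the group's latest updated_at.
                let latest = kept.updated_at.max(row.updated_at);
                // false < true, so a non-mirror row wins; among equals,
                // the earliest created_at wins.
                if (is_mirror(&row), row.created_at) < (is_mirror(kept), kept.created_at) {
                    *kept = row;
                }
                kept.updated_at = latest;
            }
        }
    }

    order.into_iter().filter_map(|k| groups.remove(&k)).collect()
}
```

Comparing the (is_mirror, created_at) tuples keeps both preference rules in one expression, which may make it easier to check against the ordering used by the SQL dedup.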

How a reviewer can verify

cargo test --bin gitlawb-node repos::tests
# Against a node that has both a mirror and a canonical row for one repo:
curl -fsSL 'http://<node>/api/v1/repos?owner=z6Mkwbud...'  # one record, owner_did = did:key:..., real description

Before you request review

  • Scope is one logical change; no unrelated churn
  • cargo test --workspace passes locally
  • New behavior is covered by tests (required for fixes)
  • cargo fmt --all and cargo clippy --workspace --all-targets -- -D warnings are clean
  • Commit titles use Conventional Commits (feat(...), fix(...), docs(...))
  • Docs / .env.example updated if behavior or config changed (or N/A)
  • Checked existing PRs so this isn't a duplicate

Notes for reviewers

The dedup logic now lives in two places: the SQL DISTINCT ON in Db::list_all_repos_paged and this Rust helper for the non-paged surfaces. They use the same preference rules and the helper's doc comment flags that they must stay in sync. Consolidating both behind one path is possible later but would change the legacy "return all rows" contract that peer/CLI callers rely on, so I kept it out of scope.

The non-paged GET /api/v1/repos legacy path and list_federated_repos
returned both the short-owner peer mirror row and the canonical did:key
row for the same logical repo, so profiles rendered the repo twice. Only
the paged path collapsed them, in SQL.

Add a dedupe_canonical_repos helper that groups by (normalized owner,
name), keeps the canonical non-mirror row (tie broken by earliest
created_at), and carries the group's latest updated_at onto the
survivor, matching the paged SQL dedup. Apply it at both non-paged
surfaces and cover it with unit tests.
@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Warning

Review limit reached

@beardthelion, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 53 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c61717b4-2ee0-4ae6-b6e4-c9610984dec6

📥 Commits

Reviewing files that changed from the base of the PR and between e37ea7f and dcbad62.

📒 Files selected for processing (1)
  • crates/gitlawb-node/src/api/repos.rs

Development

Successfully merging this pull request may close these issues.

Deduplicate mirrored repo rows with canonical did:key owner on profile/list surfaces