Skip to content

Make set --users-without-explicit-perms scale with rule scope, not user count#28

Merged
marcleblanc2 merged 1 commit into
mainfrom
perf-users-without-explicit-perms
Jun 12, 2026
Merged

Make set --users-without-explicit-perms scale with rule scope, not user count#28
marcleblanc2 merged 1 commit into
mainfrom
perf-users-without-explicit-perms

Conversation

@marcleblanc2

Copy link
Copy Markdown
Collaborator

Problem

set --users-without-explicit-perms probed every active site user with permissionsInfo.repositories(source: API, first: 1) before applying the maps.yaml user selectors, then hydrated each surviving user with one UserByID request. The probe costs 225–350ms of server CPU per user regardless of alias batching — the Sourcegraph resolver materializes every accessible repo before the LIMIT applies (see the new "Presence-check resolver internals" section in dev/engineering-requests.md). On the 10k-user / 50k-repo test instance the mode took 87 minutes at degraded test settings, minutes at defaults.

Changes

  • Match mapping rules locally BEFORE the explicit-permission probes, so probes scale with the rule-matched user count instead of the instance user count. Set intersection is commutative: selected users (matched ∧ no explicit perms) are identical either way.
  • Hydrate users in aliased 25-user UsersByIDBatch requests instead of one UserByID per user (also speeds up get --users-without-explicit-perms / --created-after).
  • Performance tier now measures the default configuration via the new set-users-without-explicit-perms-default-batch case; the batch=1 case stays local-only to keep exercising the batching loop.
  • dev/engineering-requests.md: added measured probe costs and resolver/SQL findings (unconditional LoadUserPermissions full read, CTE not short-circuited by first:1, no index on source) — the upstream presence/filter API request is ready to submit.

Measured result

Dry-run against m.eks.m.ps.sgdev.org (10,002 users, 50,023 repos) with the fixture maps: 5,210s → 15.2s; 1 exists-probe batch instead of 10,002.

Behavioral note

With backups enabled, this mode's before/after snapshots now cover only the rule-matched users (the full set of users the run can mutate) instead of every user lacking explicit perms. Planned grants are identical; restore still covers the run's entire blast radius, and the narrower snapshot can no longer clobber concurrent changes to untouched users on restore.

Validation

  • uv run tests/run.py: 97/97 passed (ruff, pyright, 113 unit tests, fixture cases incl. both CLI and import runners, randomized invariants, markdownlint, actionlint).
  • Live dry-run against the test instance planned exactly the expected 4 grants (2 users × 2 repos).
  • Not yet run: tests/run.py --live without-explicit-perms (seeded apply + read-back + restore) and --performance baseline comparison.

…er count

The mode probed every active site user with
permissionsInfo.repositories(source: API, first: 1) before applying the
maps.yaml user selectors, then hydrated each surviving user with one
UserByID request. The probe costs 225-350ms of server CPU per user
regardless of batching (the resolver materializes every accessible repo
before the LIMIT applies), so a 10k-user instance took 87 minutes at the
degraded test settings and minutes at defaults.

- Match mapping rules locally BEFORE the explicit-permission probes, so
  probes scale with the rule-matched user count instead of the instance
  user count.
- Hydrate users in aliased 25-user UsersByIDBatch requests instead of
  one UserByID request per user (also speeds up get).
- Measured on the 10k-user / 50k-repo test instance with the fixture
  maps: 5,210s -> 15.2s end to end.
- Performance tier now measures the default configuration via the new
  set-users-without-explicit-perms-default-batch case; the batch=1 case
  stays local-only to keep exercising the batching loop.
- dev/engineering-requests.md: add the presence-probe resolver internals
  (unconditional LoadUserPermissions full read, CTE not short-circuited
  by first:1, no index on source) and measured per-user costs, ready to
  submit upstream.

Amp-Thread-ID: https://ampcode.com/threads/T-019eba6a-74bc-7109-bb95-b82cefaeaa10
Co-authored-by: Amp <amp@ampcode.com>
@marcleblanc2 marcleblanc2 force-pushed the perf-users-without-explicit-perms branch from b586059 to f073ea1 Compare June 12, 2026 07:48
@marcleblanc2 marcleblanc2 merged commit 2dc3620 into main Jun 12, 2026
6 checks passed
@marcleblanc2 marcleblanc2 deleted the perf-users-without-explicit-perms branch June 12, 2026 07:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant