Make set --users-without-explicit-perms scale with rule scope, not user count#28
Merged
Merged
Conversation
…er count The mode probed every active site user with permissionsInfo.repositories(source: API, first: 1) before applying the maps.yaml user selectors, then hydrated each surviving user with one UserByID request. The probe costs 225-350ms of server CPU per user regardless of batching (the resolver materializes every accessible repo before the LIMIT applies), so a 10k-user instance took 87 minutes at the degraded test settings and minutes at defaults. - Match mapping rules locally BEFORE the explicit-permission probes, so probes scale with the rule-matched user count instead of the instance user count. - Hydrate users in aliased 25-user UsersByIDBatch requests instead of one UserByID request per user (also speeds up get). - Measured on the 10k-user / 50k-repo test instance with the fixture maps: 5,210s -> 15.2s end to end. - Performance tier now measures the default configuration via the new set-users-without-explicit-perms-default-batch case; the batch=1 case stays local-only to keep exercising the batching loop. - dev/engineering-requests.md: add the presence-probe resolver internals (unconditional LoadUserPermissions full read, CTE not short-circuited by first:1, no index on source) and measured per-user costs, ready to submit upstream. Amp-Thread-ID: https://ampcode.com/threads/T-019eba6a-74bc-7109-bb95-b82cefaeaa10 Co-authored-by: Amp <amp@ampcode.com>
b586059 to
f073ea1
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
set --users-without-explicit-permsprobed every active site user withpermissionsInfo.repositories(source: API, first: 1)before applying the maps.yaml user selectors, then hydrated each surviving user with oneUserByIDrequest. The probe costs 225–350ms of server CPU per user regardless of alias batching — the Sourcegraph resolver materializes every accessible repo before the LIMIT applies (see the new "Presence-check resolver internals" section indev/engineering-requests.md). On the 10k-user / 50k-repo test instance the mode took 87 minutes at degraded test settings, minutes at defaults.Changes
UsersByIDBatchrequests instead of oneUserByIDper user (also speeds upget --users-without-explicit-perms/--created-after).set-users-without-explicit-perms-default-batchcase; the batch=1 case stays local-only to keep exercising the batching loop.dev/engineering-requests.md: added measured probe costs and resolver/SQL findings (unconditionalLoadUserPermissionsfull read, CTE not short-circuited byfirst:1, no index onsource) — the upstream presence/filter API request is ready to submit.Measured result
Dry-run against m.eks.m.ps.sgdev.org (10,002 users, 50,023 repos) with the fixture maps: 5,210s → 15.2s; 1 exists-probe batch instead of 10,002.
Behavioral note
With backups enabled, this mode's before/after snapshots now cover only the rule-matched users (the full set of users the run can mutate) instead of every user lacking explicit perms. Planned grants are identical; restore still covers the run's entire blast radius, and the narrower snapshot can no longer clobber concurrent changes to untouched users on restore.
Validation
uv run tests/run.py: 97/97 passed (ruff, pyright, 113 unit tests, fixture cases incl. both CLI and import runners, randomized invariants, markdownlint, actionlint).tests/run.py --live without-explicit-perms(seeded apply + read-back + restore) and--performancebaseline comparison.