Merged
10 changes: 8 additions & 2 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -201,10 +201,16 @@ organization sync maps SAML groups to Sourcegraph org membership. Read
CLI lives in `src/src_auth_perms_sync/`; invoke with `uv run src-auth-perms-sync`.
Strict pyright covers the package. Root modules are entrypoints only:

- `cli.py` — `main()`, arg parsing, owns the CLI description.
- `cli.py` — `main()`, arg parsing, owns the CLI description. Module
wrappers (`Get`/`Set`/`Restore`/`SyncSamlOrgs`) return result dataclasses
and never install logging handlers; only `main()` runs CLI-mode logging.
- `shared/` — cross-workflow helpers: Sourcegraph auth-provider/user list
helpers, shared GraphQL operations and TypedDicts, site-config validation,
and SAML group parsing.
and SAML group parsing. `shared/backups.py` defines `RunPaths`: every
filesystem path for one run, resolved once at the edge
(`resolve_run_paths`) and threaded explicitly — never recompute paths
from cwd or globals below the edge, and honor `run_paths.write_files`
(False under `--no-files`) before any disk write.
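
A minimal sketch of the "resolve once at the edge, thread explicitly" pattern described above. This is not the actual `shared/backups.py` code: every field name other than `write_files`, the `resolve_run_paths` signature, and the `write_snapshot` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class RunPaths:
    """Hypothetical stand-in: every filesystem path for one run."""

    artifacts_dir: Path
    maps_path: Path
    write_files: bool  # False under --no-files


def resolve_run_paths(artifacts_dir: str, endpoint: str, no_files: bool) -> RunPaths:
    """Resolve all paths once, at the CLI/module edge (signature is assumed)."""
    root = Path(artifacts_dir).resolve() / endpoint
    return RunPaths(
        artifacts_dir=root,
        maps_path=root / "maps.yaml",
        write_files=not no_files,
    )


def write_snapshot(run_paths: RunPaths, name: str, body: str) -> Optional[Path]:
    """Write below the edge using only the threaded run_paths, never cwd or globals."""
    if not run_paths.write_files:
        return None  # honor --no-files before any disk write
    run_paths.artifacts_dir.mkdir(parents=True, exist_ok=True)
    target = run_paths.artifacts_dir / name
    target.write_text(body, encoding="utf-8")
    return target
```

Because the dataclass is frozen and passed as an argument, code below the edge cannot quietly recompute or mutate a path, which is the property the rule above is protecting.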

Business workflows live in packages:

45 changes: 40 additions & 5 deletions README.md
@@ -137,14 +137,44 @@ config = src.Config(
apply=False, # Dry run (default), set to True to make changes
)

succeeded = src.Set(config)
result = src.Set(config) # truthy on success; result.paths has run artifacts

# Discovery returns the auth provider and code host data in memory, so you
# can assemble mapping rules without re-parsing the generated YAML files:
get_result = src.Get(config)
for provider in get_result.auth_providers:
...
for code_host in get_result.code_hosts:
...

# Other command wrappers:
# succeeded = src.Get(config)
# succeeded = src.Restore(config)
# succeeded = src.SyncSamlOrgs(config)
# result = src.Restore(config)
# result = src.SyncSamlOrgs(config)
```
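
One way a result object can stay "truthy on success" while also carrying run artifacts is a dataclass that defines `__bool__`. The sketch below is an assumption about the shape, not the package's actual `CommandResult`: the class name `SketchResult`, the `succeeded` field, and the type of `paths` are hypothetical (only the attribute name `paths` appears in the README).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical result type: truthiness mirrors success."""

    succeeded: bool
    paths: Dict[str, str] = field(default_factory=dict)

    def __bool__(self) -> bool:
        # Lets callers keep writing `if src.Set(config): ...` style checks.
        return self.succeeded
```

With this shape, code written against the older boolean return value (`if succeeded:`) keeps working, while newer callers can read attributes off the same object.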

Module mode never touches your `logging` handlers or the root logger — your
application's logging config stays in charge. To see progress messages:

```python
import logging

logging.basicConfig(level=logging.INFO) # or your own handlers
logging.getLogger("src_auth_perms_sync").setLevel(logging.INFO)
logging.getLogger("src_py_lib").setLevel(logging.INFO)
```
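
The snippet above relies on standard `logging` propagation: a library logger with no handlers of its own passes records up to whatever handlers the application installed on an ancestor logger. A stdlib-only sketch of that behavior follows; the `ListHandler` class and the log messages are illustrative, and only the logger name `src_auth_perms_sync` comes from the README.

```python
import logging


class ListHandler(logging.Handler):
    """Application-owned handler that collects formatted messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.name}: {record.getMessage()}")


# The application configures the root logger; the library installs nothing.
app_handler = ListHandler()
root = logging.getLogger()
root.setLevel(logging.WARNING)
root.addHandler(app_handler)

library_logger = logging.getLogger("src_auth_perms_sync")

# Dropped: the library logger inherits the root's WARNING level.
library_logger.info("hidden progress message")

# Opt in to progress messages for this package only.
library_logger.setLevel(logging.INFO)
library_logger.info("visible progress message")
```

Only the second message reaches `app_handler`, because the level check happens on the logger where the record is created and the record then propagates to the root logger's handlers.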

To receive structured wide events programmatically, pass an event sink:

```python
events = src.InMemoryEventSink()
src.Get(config, event_sink=events) # or src.CallbackEventSink(my_function)
```

To run fully disk-free (no generated YAML, snapshots, or log file), set
`no_files=True`. Combined with `apply=True` this also requires
`no_backup=True`, because skipping files gives up the before/after
snapshots that make `--apply` reversible.
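
The flag constraint above can be written as a small validation step. This is a sketch under assumptions: the function name `validate_file_flags` and the error text are hypothetical, and the package may enforce the rule differently.

```python
def validate_file_flags(*, apply: bool, no_files: bool, no_backup: bool) -> None:
    """Reject disk-free applies unless the caller explicitly gives up backups."""
    if no_files and apply and not no_backup:
        raise ValueError(
            "no_files=True with apply=True also requires no_backup=True: "
            "without files there are no before/after snapshots to restore from"
        )
```

Requiring the explicit `no_backup=True` means giving up reversibility is a deliberate choice by the caller and never a side effect of skipping files.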

## Inputs

- Environment variables (CLI), or src.Config args (Python import)
@@ -154,7 +184,12 @@ succeeded = src.Set(config)

- YAML maps file
- By default: `src-auth-perms-sync-runs/<src_endpoint>/maps.yaml`
- Or pass `--maps-path ./path/to/maps.yaml`
- Or pass `--maps-path ./path/to/maps.yaml` (works for both `get` and `set`,
so the maps file can live outside the generated artifacts tree)
- `--artifacts-dir DIR` moves the whole artifacts tree (generated YAML,
snapshots, logs); the default is `./src-auth-perms-sync-runs`
- `--no-files` writes nothing to disk; with `--apply` it also requires
`--no-backup`
- A list of mapping rules
- Each mapping rule takes
- A map of filters for users
458 changes: 458 additions & 0 deletions dev/PLAN.md

Large diffs are not rendered by default.

19 changes: 11 additions & 8 deletions dev/TODO.md
@@ -1,5 +1,15 @@
# TODO

## Follow-up: in-memory mapping rules for Set (PLAN.md Track A Phase A4)

The rest of [PLAN.md](./PLAN.md) is implemented (src-py-lib v0.3.0 +
the consumer refactor-logging-and-files PR). The one deliberately
deferred piece, marked optional in the plan: let module callers pass
parsed mapping rules to `Set` instead of a maps file, so the full
get → assemble → dry-run loop never touches disk. Snapshots must stay
on disk for `--apply` (reversibility invariant); `no_files` + `apply`
must keep requiring `no_backup`.

## High priority: Remote trigger on demand

- Sourcegraph webhook for new user coming in v7.4.0
@@ -13,18 +23,11 @@
- How do we avoid stampedes (e.g., bulk repo sync triggering thousands
of re-runs)?

## High priority: Reduce worst-case full-permission sync load
## Medium priority: Reduce worst-case full-permission sync load

- Use the stress-run evidence in
[engineering-requests.md](./engineering-requests.md)
to request Sourcegraph bulk explicit-permission read and write APIs.
2026-06-12: presence-probe resolver internals and measured costs added
there (see "Presence-check resolver internals"); request is ready to
submit. Client-side, `set --users-without-explicit-perms` now matches
rules before probing and hydrates users in aliased batches (5,210s →
15s on the 10k-user instance), but `get --users-without-explicit-perms`
still probes every active user — only a server-side presence/filter API
fixes that.
New evidence 2026-06-10: the whole-instance apply (1,150 repo
overwrites x 10,002 bindIDs each at parallelism 16) crashed the test
instance's Postgres ("connection refused", "unexpected EOF"); the
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -32,7 +32,7 @@ classifiers = [
dependencies = [
"json5>=0.14.0",
"pyyaml>=6.0.3",
"src-py-lib[otel]==0.2.1",
"src-py-lib[otel]==0.3.0",
]
keywords = [
"Sourcegraph"
29 changes: 27 additions & 2 deletions src/src_auth_perms_sync/__init__.py
@@ -1,11 +1,36 @@
"""Importable API for src-auth-perms-sync."""
"""Importable API for src-auth-perms-sync.

from .cli import Config, Get, Restore, Set, SyncSamlOrgs
Module-mode commands never touch stdlib logging handlers; configure your own
`logging` handlers and levels (e.g. on the `src_auth_perms_sync` logger) to
see progress messages. Pass an `EventSink` (re-exported from `src_py_lib`)
to receive structured wide events programmatically.
"""

from src_py_lib import (
CallbackEventSink,
CompositeEventSink,
EventSink,
InMemoryEventSink,
JSONLEventSink,
NullEventSink,
)

from .cli import CommandResult, Config, Get, GetResult, Restore, Set, SyncSamlOrgs
from .shared.backups import RunPaths

__all__ = [
"CallbackEventSink",
"CommandResult",
"CompositeEventSink",
"Config",
"EventSink",
"Get",
"GetResult",
"InMemoryEventSink",
"JSONLEventSink",
"NullEventSink",
"Restore",
"RunPaths",
"Set",
"SyncSamlOrgs",
]