SK-536: Add sx vault copy for lossless cross-vault migration#152
Conversation
- Sync vendored schema; add importAuditEvents + assetUsageEvents ops - Team/user-scoped SetAssetInstallation via installations input - GetUsageStats via assetUsageEvents (paged) + shared summarizer - RecordUsageEvents sends actor for original attribution
- New Vault.ImportAuditEvents: lossless audit-history import - Path/Git append JSONL verbatim (git via vault tx); Sleuth delegates to importAuditEvents, chunked to the 5000 cap - Round-trip test on a path vault
- AddTeamRepository/RemoveTeamRepository via updateTeam skillsRepositories - Resolve repo identifiers (URL/slug/SSH) to GIDs by paging org repos - Re-send members on update so the replace preserves membership - Unit test for trailingOwnerName
- internal/vaultcopy: backend-agnostic engine copying teams, bots, assets+versions, installation scopes, audit, and usage between vaults - Vault.ReadUsageEvents (raw usage read) for path/git/sleuth - ManifestAssetScopes capability for complete file-backed scope reads - sx vault copy --from/--to with --only, --dry-run, always-preview-then-yes - Path-to-path round-trip test asserting losslessness + idempotency
- Register org-wide (scopeless) assets on the destination instead of copying files without an install row - ManifestAssetScopes returns presence so files-only assets aren't mis-registered as org-wide - Count scopes during dry-run preview; simplify version counting - Warn when source teams exceed the list limit (truncation) - Round-trip test covers an org-wide asset
- QueryAuditEvents pages assetAuditLog at the server's 50-cap with endCursor (was a single first=1000 call that the server rejects) - Add AssetInstallScopes: read an asset's installation scopes from any vault (Sleuth via installations query; file-backed via manifest), so scopes copy even when the source is skills.new - Engine uses the backend-neutral assetScopeReader capability
- importAuditEvents: encode event data as a JSON string for the JSONString scalar (was sending a raw object; server rejected with "must be str, not dict"). Verified against deployed skills.new. - Copy engine: a category-level failure (e.g. audit) is now a warning and the copy continues, so it can't abort later categories (usage).
- RecordUsageEvents posts in batches of 1000 instead of one giant /api/usage request, so a whole-vault usage import is enqueued in manageable chunks
|
Claude finished @detkin's task in 3m 43s —— View job Review: PR #152 — Add sx vault copySolid implementation of a backend‑agnostic copy engine; the main concerns are (1) a 50‑asset cap on Sleuth [HIGH] Sleuth source silently caps assets to 50 per type — breaks lossless claimLocation: Issue: The copy engine asks for Suggestion: Either page [HIGH] AssetInstallScopes can silently drop scopes due to slug vs. name mismatchLocation: Issue: Suggestion: Match on slug and name (slug wins), mirroring if strings.EqualFold(resp.Vault.Assets.Nodes[i].GetSlug(), name) ||
strings.EqualFold(resp.Vault.Assets.Nodes[i].GetName(), name) {Also guard against the search returning more than 50 hits for very generic slugs — pin to exact match server-side if possible, or page. [MEDIUM] Preview replays the full audit + usage reads, doubling wall time on large vaultsLocation: Issue: Suggestion: Either (a) thread a "counts only" mode through the engine so DryRun can use a cheap [MEDIUM] Audit/usage import is non-idempotent; partial failure causes duplicates on retryLocation: Issue: Both methods chunk through events with Suggestion: At minimum, document this in the [LOW] Bot creation against Sleuth destination auto-issues an API key that is silently discardedLocation: Issue: Suggestion: After [LOW] Scope copy ordering can leave assets uploaded without scopes if scope read failsLocation: Issue: For each asset, Suggestion: Either (a) read the scope set up front (before any uploads) so a read failure aborts the asset cleanly with no half-state, or (b) include a separate [LOW] `sleuthBotListSoftCap = 1000` is silently relied on by the copy engineLocation: Issue: Suggestion: Either page the bot list (when/if the server adds a connection), or have [LOW] `assetUsageEvents` query selects `botName` but the value is never usedLocation: Issue: The GraphQL query asks for Suggestion: Drop [LOW] Round-trip test only exercises path→path with one audit/usage event eachLocation: Issue: The test imports a single audit event and a single usage event, so the chunking paths in Suggestion: Add a unit test that seeds more events than |
- Sleuth SetAssetInstallations sets an asset's full install set in a single server call, so multi-scope assets no longer lose all but the last scope to the replace-on-set mutation - Engine prefers the bulk call for >1 scope, falls back to per-target on error (preserves best-effort partial success) - Unit tests for the dispatch logic
|
Claude finished @detkin's task in 2m 55s —— View job PR Review: #152 — Add sx vault copyMulti-scope replace‑on‑set and chunked usage POSTs are now fixed, but two HIGH issues from the prior review (the 50‑asset cap on the Sleuth source, and the slug/name mismatch in [HIGH] Sleuth source still silently caps assets at 50 per type — breaks the lossless claimLocation: Issue: Suggestion: Page through [HIGH] `AssetInstallScopes` still uses name-only matching against an input that is a slug — scopes silently droppedLocation: Issue: Suggestion: Add if strings.EqualFold(node.GetSlug(), name) ||
strings.EqualFold(node.GetName(), name) {Also pin the [MEDIUM] Preview still replays the full audit + usage reads, doubling wall time on large vaultsLocation: Issue: Suggestion: Either thread a "counts only" mode through [MEDIUM] Audit/usage import remains non-idempotent — retry duplicates eventsLocation: Issue: Both methods chunk in a Suggestion: At minimum, update the [LOW] Bot creation discards the auto-issued API key — destination accumulates orphan keysLocation: Issue: Suggestion: Either delete the auto-issued key immediately after [LOW] Scope read happens after version upload — read failure leaves stranded asset versionsLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions, so a scope-read failure aborts the asset cleanly with no half-state. Or add a [LOW] `sleuthBotListSoftCap = 1000` is silently relied on by the copy engineLocation: Issue: Suggestion: Either upgrade [LOW] `assetUsageEvents` query selects `botName` but the value is never consumedLocation: Issue: The query asks for Suggestion: Drop [LOW] Round-trip test does not exercise the chunking paths that are the novel logic in this PRLocation: Issue: The test seeds 1 audit event and 1 usage event. The new chunking logic in Suggestion: Add a unit test that seeds more events than |
- AssetInstallScopes returns present=false when a skills.new asset has no installations (only an explicit ORGANIZATION install is org-wide), so the copy no longer marks uninstalled source assets org-wide. (Note: skills.new still defaults uploads to org-wide server-side — separate follow-up.)
|
Claude finished @detkin's task in 2m 16s —— View job PR Review: #152 — Add sx vault copyThe latest commits fix multi-scope replace-on-set and tighten "zero-install ≠ org-wide", but the two HIGH issues from the prior review (the 50-asset cap on the Sleuth source and the slug/name mismatch in [HIGH] Sleuth source still silently caps assets at 50 per type — breaks the lossless claimLocation: Issue: Suggestion: Page through [HIGH] `AssetInstallScopes` still uses name-only matching against an input that is actually a slug — scopes silently droppedLocation: Issue: Suggestion: Add if strings.EqualFold(node.GetSlug(), name) ||
strings.EqualFold(node.GetName(), name) {Also pin the [MEDIUM] Preview still replays the full audit + usage reads, doubling wall time on large vaultsLocation: Issue: Suggestion: Either thread a "counts only" mode through [MEDIUM] Audit/usage import remains non-idempotent — retry duplicates eventsLocation: Issue: Both methods chunk in Suggestion: At minimum, update the warning text in [LOW] Bot creation discards the auto-issued API key — destination accumulates orphan keysLocation: Issue: Suggestion: Either delete the auto-issued key immediately after [LOW] Scope read happens after version upload — read failure leaves stranded asset versionsLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions, so a scope-read failure aborts the asset cleanly. Or add a [LOW] `sleuthBotListSoftCap = 1000` is silently relied on by the copy engineLocation: Issue: Suggestion: Either upgrade [LOW] `assetUsageEvents` query selects `botName` but the value is never consumedLocation: Issue: The query asks for Suggestion: Drop [LOW] Round-trip test does not exercise the chunking paths that are the novel logic in this PRLocation: Issue: The test seeds 1 audit event and 1 usage event. The new chunking logic in Suggestion: Add a unit test that seeds more events than |
- HIGH: paginate Sleuth ListAssets (was capped at 50/type, silently truncating); prod now reads 90 assets/354 versions vs 79/249 - HIGH: AssetInstallScopes matches on slug (caller passes the slug), not display name — scopes were silently dropped when they differ - MEDIUM: --yes applies directly instead of preview-then-apply, so a large vault's audit/usage history isn't read twice - MEDIUM: document that audit/usage import is additive (retry duplicates) - LOW: drop unused botName from the usage export query
|
Claude finished @detkin's task in 2m 15s —— View job PR Review: #152 — Add sx vault copyBoth HIGH issues from the prior review are now fixed (asset pagination + slug-first scope matching), and the MEDIUM concerns are addressed at the minimum suggested level ( [RESOLVED ✓] 50-asset cap on Sleuth sourceLocation: Verified: [RESOLVED ✓] AssetInstallScopes slug vs name mismatchLocation: Verified: [RESOLVED ✓] Preview replays full audit + usage readsLocation: Verified: [RESOLVED ✓] `botName` selected but unusedLocation: Verified: [ADDRESSED] Audit/usage import non-idempotency now documentedLocation: Verified: The command long-help now states "audit and usage import is additive — re-running a copy whose audit/usage stage previously failed part-way will duplicate the already-imported events on the destination". This matches the suggested minimum mitigation. Stronger server-side dedup (by [LOW — open] Bot creation discards the auto-issued API key on Sleuth destinationsLocation: Issue: Suggestion: Either revoke the auto-issued key immediately after [LOW — open] Scope read still happens after version uploadLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly with no half-state, or add a [LOW — open] `sleuthBotListSoftCap = 1000` warning isn't surfaced in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: Both the round-trip and the import e2e test seed only 1-3 events, far below Suggestion: Add a focused unit test against |
d97b170 to
745e9c2
Compare
|
Claude finished @detkin's task in 5m 8s —— View job PR Review: #152 — Add sx vault copyBoth prior HIGH issues are confirmed fixed (asset pagination at [RESOLVED ✓] 50-asset cap, slug/name mismatch, preview replay, `botName` selected-but-unusedVerified:
[MEDIUM — new] Bulk-scope fallback silently drops scopes on Sleuth destinationsLocation: Issue: When the bulk A concrete failure path: Suggestion: Either drop the failing target from [LOW — open] Bot creation discards the auto-issued Sleuth API keyLocation: Issue: Suggestion: Sharpen the warning to: [LOW — open] Scope read happens after version upload — read failure leaves stranded asset versionsLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly with no half-state, or add a [LOW — open] `sleuthBotListSoftCap = 1000` warning isn't surfaced in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: Both tests seed 1-3 events, far below Suggestion: Add a focused unit test against · Branch |
44d867a to
82abfdb
Compare
|
Claude finished @detkin's task in 3m 33s —— View job PR Review: #152 — Add sx vault copyThe new MEDIUM from the prior review (bulk-scope fallback clobbering) is now confirmed fixed in [RESOLVED ✓] Bulk-scope fallback clobberingLocation: Verified: [RESOLVED ✓] Prior fixes confirmedVerified:
[LOW — open] Bot creation discards the auto-issued Sleuth API keyLocation: Issue: Suggestion: Sharpen the warning to: [LOW — open] Scope read still happens after version upload — read failure leaves stranded asset versionsLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly with no half-state. Or add a [LOW — open] `sleuthBotListSoftCap = 1000` warning isn't surfaced in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: The round-trip test seeds 1 audit event and 1 usage event; the new Suggestion: Add a focused unit test against |
|
Claude finished @detkin's task in 3m 26s —— View job PR Review: #152 — Add sx vault copyCommit [MEDIUM — new] Path-scoped installs silently widen to whole-repo on copy from SleuthLocation: Issue: Concrete failure: source has Suggestion: Map [RESOLVED ✓] Bulk-scope fallback no longer clobbersVerified: [LOW — open] Audit/usage runtime warning still doesn't mention re-run duplicatesLocation: Issue: The command long-help ( Suggestion: Special-case the audit/usage stages with a sharper message, e.g. [LOW — open] Bot creation discards the auto-issued Sleuth API keyLocation: Issue: Suggestion: Sharpen the warning to include the cleanup instruction: [LOW — open] Scope read after version upload leaves stranded versions on transient read failureLocation: Issue: Each asset uploads every version first (lines 171-192), then reads scopes (line 194). If the scope read errors after uploads succeeded, the asset content is on the destination but no installation is recorded — a dangling, never-installed version set. The error becomes a warning and the report just shows "copied N assets". The slug-first fix removes the common silent-drop path but a network blip on the scope read still hits the same shape, and the new path-scope issue above can quietly compound this. Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly. Or add a [LOW — open] `sleuthBotListSoftCap = 1000` warning still doesn't surface in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: Both tests seed 1-3 events, well below Suggestion: Add a focused unit test against · Branch · |
- AssetInstallScopes resolves a repo install's mono-repo config to its include paths (via repository.monoRepoConfigs), emitting a path scope instead of silently widening to the whole repo
|
Claude finished @detkin's task in 3m 36s —— View job PR Review: #152 — Add sx vault copyBoth prior HIGH issues remain fixed and the prior-round MEDIUM (path scopes silently widening to whole-repo) is now resolved via [RESOLVED ✓] Path-scoped installs no longer widen to whole-repo on Sleuth → Sleuth copyLocation: Verified: [RESOLVED ✓] Prior fixes still in placeVerified:
[LOW — new] `monoRepoConfigName` is selected by `asset_installations.graphql` but never consumedLocation: Issue: The new install scope reader matches on Suggestion: Drop [LOW — new] `monoRepoConfigPaths` is N+1 across an asset's path-scoped installsLocation: Issue: For each path-scoped install row, Suggestion: Cache [LOW — open] Bot creation discards the auto-issued Sleuth API keyLocation: Issue: Suggestion: Sharpen the warning to: [LOW — open] Scope read still happens after version upload — read failure leaves stranded versionsLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly, or add a [LOW — open] `sleuthBotListSoftCap = 1000` warning still isn't surfaced in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] Audit/usage runtime warning still doesn't mention re-run duplicatesLocation: Issue: The command long-help ( Suggestion: Special-case the audit/usage stages with a sharper message, e.g. [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: The round-trip test seeds 1 audit event + 1 usage event; the path-vault e2e seeds 2 audit events. Both well below Suggestion: Add a unit test against |
|
Claude finished @detkin's task in 2m 31s —— View job PR Review: #152 — Add sx vault copyThe most recent commit ( [RESOLVED ✓] `monoRepoConfigName` no longer selected from `AssetInstallations`Location: Verified: The selection set is now [RESOLVED ✓] Prior fixes still in placeVerified:
[LOW — open] Audit/usage runtime warning is still generic — operators won't see the duplicate-on-retry caveat at the moment of failureLocation: Issue: The command long-help ( Suggestion: Special-case the audit/usage stages with a sharper message, e.g. [LOW — open] Bot creation discards the auto-issued Sleuth API key — destination accumulates orphan keysLocation: Issue: Suggestion: Sharpen the warning to include the cleanup instruction: [LOW — open] Scope read still happens after version upload — read failure leaves stranded versionsLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly with no half-state. Or add a [LOW — open] `monoRepoConfigPaths` is N+1 across an asset's path-scoped installsLocation: Issue: For each path-scoped install row, Suggestion: Cache [LOW — open] `sleuthBotListSoftCap = 1000` warning isn't surfaced in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: The round-trip test seeds 1 audit event + 1 usage event; the path-vault e2e seeds 2 audit events. Both far below Suggestion: Add a unit test against |
|
Claude finished @detkin's task in 3m 33s —— View job PR Review: #152 — Add sx vault copyCommit [MEDIUM] All-unresolved scope set leaves Sleuth dst with the auto-applied org-wide installLocation: Issue:
This is the same widening-by-omission the new Suggestion: When the bulk path ends up applying zero targets (either the explicit short-circuit at applied := len(targets) - len(unresolved)
if applied == 0 {
if err := dst.ClearAssetInstallations(ctx, name); err != nil {
r.warnf("clear auto-applied install on %q: %v", name, err)
}
return
}A test stub that returns all-unresolved would catch this regression in [LOW — open] Bot creation discards the auto-issued Sleuth API key — destination accumulates orphan keysLocation: Issue: Suggestion: Sharpen the warning to include the cleanup instruction: [LOW — open] Audit/usage runtime warning still doesn't mention re-run duplicatesLocation: Issue: The command long-help ( Suggestion: Special-case the audit/usage stages, e.g. [LOW — open] Scope read still happens after version upload — read failure leaves stranded versions and (on Sleuth) a widened installLocation: Issue: For each asset, Suggestion: Read scopes before uploading versions so a scope-read failure aborts the asset cleanly with no half-state. Or add a [LOW — open] `sleuthBotListSoftCap = 1000` warning isn't surfaced in the copy reportLocation: Issue: Suggestion: Either upgrade [LOW — open] `monoRepoConfigPaths` is N+1 across an asset's path-scoped installsLocation: Issue: For each path-scoped install row, Suggestion: Cache [LOW — open] Chunking paths (5000-event audit, 1000-event usage) still uncovered by testsLocation: Issue: The round-trip test seeds 1 audit event + 1 usage event; the path-vault e2e seeds 2 audit events. Both far below Suggestion: Add a unit test against |
- MEDIUM: clear auto-applied install when a present asset's scopes all fail to resolve (or the set errors), so it isn't left org-wide - Revoke the auto-issued bot API key on copy (no orphan keys) - Audit/usage failure warning notes re-run duplicates - Read asset scopes before uploading versions (no stranded uploads) - Surface the Sleuth bot-list cap in the report - Cache mono-repo configs per read (drop monoRepoConfigPaths N+1) - Extract chunkSlice helper for audit/usage chunking + unit test
|
Claude finished @detkin's task in 3m 15s —— View job PR Review: #152 — Add sx vault copyCommit [RESOLVED ✓] Clear-on-empty for all-unresolved (prior MEDIUM)Location: Verified: After the bulk call returns [RESOLVED ✓] Bot creation revokes the auto-issued Sleuth API keyLocation: Verified: After [RESOLVED ✓] Scopes read before version upload — no stranded versions on read failureLocation: Verified: The [RESOLVED ✓] Audit/usage failure warning mentions duplicate-on-retryLocation: Verified: When [RESOLVED ✓] Sleuth bot-list soft cap surfaced in the reportLocation: Verified: [RESOLVED ✓] monoRepoConfigPaths N+1 collapsed via per-call cacheLocation: Verified: [RESOLVED ✓] Chunking now has direct unit coverageLocation: Verified: Audit ( [LOW] `len(bots) >= sleuthBotListSoftCap` can warn falsely on a vault with exactly 1000 botsLocation: Issue: The heuristic fires whenever Suggestion: Soften the message to "source returned %d bots — this matches the server's list cap; if your vault has more, the extras were not copied". Or have the Sleuth side return a [LOW] Unsupported scope kinds on a present asset silently leave the auto-applied install in placeLocation: Issue: For a Suggestion: Treat if len(targets) == 0 {
clearAutoInstall(ctx, dst, name, r)
return
}Add a regression test that stubs
|
|
Claude finished @detkin's task in 1m 43s —— View job PR Review: #152 — Add sx vault copyCommit [RESOLVED ✓] Unmapped scope kinds no longer leave the auto-applied install in placeLocation: Verified: When every source scope fails [RESOLVED ✓] Bot soft-cap warning softened — no longer false-positive at exactly 1000Location: Verified: The warning now reads "source returned N bots, the most the server lists in one call; if the org has more, they were not copied" — fact-stating rather than asserting truncation. On a vault with exactly 1000 bots the operator gets a conditional notice rather than a wrong claim. (A real [RESOLVED ✓] All prior fixes from earlier rounds still in placeVerified:
[LOW] Unmapped-scope-clear branch has no direct test (symmetry gap with all-unresolved)Location: Issue: Suggestion: Add a test that stubs
|
Summary
Adds
sx vault copy --from <profile> --to <profile>— copy a whole vault (teams, bots, assets + all versions, installation scopes, audit history, usage history) between any backends (skills.new, git, path). Wires the Sleuth vault to the new GraphQL surface from sleuth-io/pulse#3954 and adds a backend-agnostic copy engine.installationsinput; team-repository management viaupdateTeam+ repo-GID resolution;GetUsageStats/ReadUsageEventsvia the newassetUsageEventsexport;RecordUsageEventssends actor;ImportAuditEvents; reads asset scopes from skills.new via theinstallationsquery.internal/vaultcopyengine: teams → bots → assets(+versions+scopes) → audit → usage. Best-effort (item failures become warnings); always previews, requires--yesto apply.assetAuditLogat the server's 50-cap (was a singlefirst=1000call); chunks audit-import and usage POSTs for large vaults.Depends on: sleuth-io/pulse#3954 (the GraphQL operations this calls). Already deployed to app.skills.new.
Testing
Exercised both directions against deployed skills.new with two real orgs (Docker): skills.new → git vault, then git vault → fresh org. Validated assets+versions, teams, bots, scopes, audit (15,885 events round-tripped), and usage (chunked). Round-trip unit test (path→path) covers losslessness + idempotency + org-wide assets.
Known limitations (candidate follow-ups)
Security implications of changes have been considered