Governance & consolidation package + repos.yaml reconciliation #26
- platform -> perfops-platform (live monorepo); reportkit/perfops-sdk noted as subdirs
- add `agents` product line (perfops-agents, mcp-perfops, agent-config)
- add vault (documentation) and perfops-testing (launch_gate umbrella)
- move now-archived repos into archived: section
- drop absent demo repos

See governance/reports/manifest-drift.md for rationale.
Code Review
This pull request introduces a comprehensive repository governance and consolidation package under the governance/ directory, including inventory CSVs, security and duplication reports, and executable scripts to automate the transition to a single monorepo topology. It also reconciles the repos.yaml manifest and adds an organization-wide SECURITY.md. Feedback on the changes highlights a critical bug in merge.sh where a failed clone can lead to executing commands in the wrong directory, a dangerous 64KB size threshold in delete.sh that risks deleting small active repos, a schema mismatch in inventory.sh that overwrites manually curated columns, and an opportunity to optimize default branch resolution in merge.sh using the GitHub CLI.
    if [ "$APPLY" = "true" ]; then
      rm -rf "$WORKDIR"; mkdir -p "$WORKDIR"
      log "Cloning seed $OWNER/$SEED into $WORKDIR/perfops ..."
      git clone "$GH_BASE/$SEED.git" "$WORKDIR/perfops"
    fi
    cd "$WORKDIR/perfops" 2>/dev/null || { warn "(dry-run: seed not cloned)"; }
Critical Bug: Script executes destructive Git commands in the wrong directory if clone fails
If APPLY is true but the git clone command fails (due to network issues, invalid permissions, or missing repository), the directory $WORKDIR/perfops will not exist.
Because of the || operator on line 49:
cd "$WORKDIR/perfops" 2>/dev/null || { warn "(dry-run: seed not cloned)"; }
the failure of the cd command is caught, bypassing set -e. The script will print the warning and continue executing the rest of the script (including git remote add, git fetch, and git subtree add) in the current working directory (which is likely the user's local clone of the .github or parent repository). This will corrupt the user's local repository with unwanted remotes and subtree merges.
We must ensure the script exits immediately if cd fails during an actual apply run.
    - if [ "$APPLY" = "true" ]; then
    -   rm -rf "$WORKDIR"; mkdir -p "$WORKDIR"
    -   log "Cloning seed $OWNER/$SEED into $WORKDIR/perfops ..."
    -   git clone "$GH_BASE/$SEED.git" "$WORKDIR/perfops"
    - fi
    - cd "$WORKDIR/perfops" 2>/dev/null || { warn "(dry-run: seed not cloned)"; }
    + if [ "$APPLY" = "true" ]; then
    +   rm -rf "$WORKDIR"; mkdir -p "$WORKDIR"
    +   log "Cloning seed $OWNER/$SEED into $WORKDIR/perfops ..."
    +   git clone "$GH_BASE/$SEED.git" "$WORKDIR/perfops"
    +   cd "$WORKDIR/perfops"
    + else
    +   cd "$WORKDIR/perfops" 2>/dev/null || log "(dry-run: seed not cloned)"
    + fi
    [ "$archived" = "true" ] || { warn "$repo: not archived — archive first, don't delete"; return 1; }
    [ "$size" -le 64 ] || { warn "$repo: size=${size}KB (not empty) — archive instead"; return 1; }
Dangerous Size Threshold for Repository Deletion
The script allows deleting any repository with a size up to 64KB.
While 64KB seems small, many highly valuable, fully functional repositories containing scripts, configurations, or small microservices are well under this limit. For example:
- order-lab (Go/Kafka microservice) is only 6KB.
- nfr-library (SLO catalog) is only 12KB.
- chaos-scenarios (Chaos engineering) is only 13KB.
If any of these repositories are archived (which is planned as part of this consolidation) and accidentally added to TO_DELETE, they will pass this safety check and be permanently and irreversibly deleted.
To prevent accidental data loss, the size threshold should be set to 0 (or a very low number like 1 or 2 KB) to ensure only truly empty repositories can be deleted.
    - [ "$archived" = "true" ] || { warn "$repo: not archived — archive first, don't delete"; return 1; }
    - [ "$size" -le 64 ] || { warn "$repo: size=${size}KB (not empty) — archive instead"; return 1; }
    + [ "$archived" = "true" ] || { warn "$repo: not archived — archive first, don't delete"; return 1; }
    + [ "$size" -eq 0 ] || { warn "$repo: size=${size}KB (not empty) — archive instead"; return 1; }
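As a rough illustration of the stricter check, the two conditions can be folded into one predicate. This is a sketch under assumptions: `is_deletable` and its argument order are hypothetical and do not come from delete.sh, and the rejection of non-numeric sizes is an extra precaution the review did not ask for.

```shell
#!/usr/bin/env bash
# Sketch of a "truly empty only" deletion guard.
# is_deletable is a hypothetical helper; delete.sh may structure this differently.

# Usage: is_deletable ARCHIVED SIZE_KB
# Succeeds only when ARCHIVED is the string "true" and SIZE_KB is exactly 0.
is_deletable() {
  local archived="$1" size_kb="$2"

  # Never delete a repo that has not been archived first.
  [ "$archived" = "true" ] || return 1

  # Treat a missing or non-numeric size as "unknown", which is not deletable.
  case "$size_kb" in
    ''|*[!0-9]*) return 1 ;;
  esac

  # Only a reported size of 0 KB counts as empty.
  [ "$size_kb" -eq 0 ]
}
```

With this shape, an archived 6KB repository such as order-lab is rejected, whereas the previous `-le 64` comparison would have let it through.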
    # repos.csv
    {
      echo "name,visibility,archived,fork,created_at,pushed_at,default_branch,stars,watchers,forks,open_issues,language,size_kb,license,description"
      echo "$repos_json" | jq -r '[.name,
        (if .private then "private" else "public" end),
        .archived, .fork, (.created_at|.[0:10]), (.pushed_at|.[0:10]),
        .default_branch, .stars, .watchers, .forks, .open_issues,
        (.language // ""), .size, .license,
        ((.description // "") | gsub("[\n,]";" "))] | @csv'
    } > "$OUT_DIR/repos.csv"
Schema Mismatch and Data Loss in Inventory Regeneration
The inventory.sh script is completely out of sync with the actual CSV files checked into the repository (repos.csv, actions.csv, and security.csv).
- Schema Mismatches:
  - repos.csv in the repo has 20 columns (including classification, topics, has_readme, branch_count, workflow_count), but this script only generates 15 columns.
  - actions.csv in the repo has 6 columns (including actions_enabled, dead_workflow_signals), but this script only generates 4 columns.
  - security.csv in the repo has 11 columns (including has_license, has_security_md, has_codeowners, secret_scanning, gitleaks_in_repo, risk_notes), but this script only generates 5 columns.
- Data Destruction:
  - Using the redirection operator > directly truncates the target files. This will completely wipe out all manually curated columns (such as classification in repos.csv and risk_notes in security.csv), despite the log message claiming that the classification column is not overwritten.
To fix this, the script should either merge the live API data with the existing CSV files to preserve manually curated columns and extra fields, or at least generate the full set of columns and avoid overwriting curated fields.
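One way the merge option could look is sketched below. It is an illustration, not the fix adopted in inventory.sh: `merge_curated` is a hypothetical helper, and it assumes simple CSVs with a header row, the repo name in the first column, no quoted fields containing commas, and a non-empty curated file.

```shell
#!/usr/bin/env bash
# Sketch: carry one manually curated column over into a freshly generated CSV.
# merge_curated is a hypothetical helper, not part of inventory.sh.
# Assumes plain comma-separated files (no quoted commas) keyed on column 1.

# Usage: merge_curated FRESH_CSV CURATED_CSV COLUMN_NAME
# Prints FRESH_CSV with COLUMN_NAME appended; values are looked up by repo name
# in CURATED_CSV and left empty for repos that are not in the curated file yet.
merge_curated() {
  local fresh="$1" curated="$2" column="$3"
  awk -F',' -v OFS=',' -v col="$column" '
    # First file (curated): locate the column, then remember its value per repo.
    NR == FNR {
      if (FNR == 1) {
        for (i = 1; i <= NF; i++) if ($i == col) idx = i
        next
      }
      if (idx) keep[$1] = $idx
      next
    }
    # Second file (fresh snapshot): extend the header, then each data row.
    FNR == 1 { print $0, col; next }
    { print $0, (($1 in keep) ? keep[$1] : "") }
  ' "$curated" "$fresh"
}
```

The result would typically be written to a temporary file and moved into place afterwards, so the curated CSV is never truncated while it is still being read.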
    gh pr create --repo "$OWNER/$SEED" --base "$(git remote show origin \
      | sed -n 's/.*HEAD branch: //p')" --head "$branch" --draft \
Inefficient and Fragile Default Branch Resolution
Using git remote show origin to find the default branch is slow and inefficient because it performs an active network request to query the remote repository.
Since the GitHub CLI (gh) is already required and authenticated, we can query the default branch much faster and more reliably using gh repo view.
    - gh pr create --repo "$OWNER/$SEED" --base "$(git remote show origin \
    -   | sed -n 's/.*HEAD branch: //p')" --head "$branch" --draft \
    + gh pr create --repo "$OWNER/$SEED" --base "$(gh repo view "$OWNER/$SEED" --json defaultBranchRef --jq .defaultBranchRef.name)" --head "$branch" --draft \
…up, delete.sh size=0, inventory.sh writes to raw/

- merge.sh: cd into clone inside the apply branch so set -e aborts on a failed clone instead of running git in the wrong cwd (gemini critical)
- merge.sh: resolve PR base via `gh repo view` instead of `git remote show`
- delete.sh: only size==0 repos are deletable (was <=64KB; tiny live repos exist)
- inventory.sh: write raw API snapshot to inventory/raw/, never clobber the curated CSVs (classification/risk_notes/deep-scan columns)
What

Adds a complete repository-governance package under governance/ plus an org-wide SECURITY.md and a one-tap Actions workflow, and reconciles repos.yaml with the live inventory.

Grounded in a read-only scan of all 46 repos (22 active, 24 archived). Nothing was archived/deleted/transferred; those steps are delivered as reviewable, dry-run-by-default scripts.

Inventory (governance/inventory/)

- repos.csv: 46 repos + classification (PROD 8 · ACTIVE 12 · LAB 2 · ARCHIVE 24)
- security.csv: visibility, baseline files, collaborators
- actions.csv: workflows + dead-workflow signals

Reports (governance/reports/)

- security-findings.md, security-hardening.md, duplications.md, archive-candidates.md, manifest-drift.md

Operations (governance/operations/)

- inventory.sh, merge.sh (git-subtree collapse, history-preserved), harden.sh, archive.sh, migrate.sh, delete.sh: all dry-run by default, shellcheck-clean.
- delete.sh is guarded by CONFIRM_DELETE=yes and an empty target list.
- .github/workflows/governance-ops.yml: run any operation from the Actions tab (no local shell; works from an iPad). Requires a one-time GOV_PAT secret.

Decisions baked in

- perfops monorepo; absorbed standalones get archived.
- perfops-consulting-site (public), architecture, .github, vault, and personal repos stay separate. See governance/README.md.

Key findings

- perfops-consulting-site has 2 external write collaborators + 131 branches.
- repos.yaml drift fixed in this PR.

Limitations (honest)

Archiving/transferring/deleting repos, creating orgs, editing settings, and adding secrets cannot be done via the available API tooling. They run either as one tap in the Actions workflow (after adding GOV_PAT) or as clicks in the GitHub UI; see the checklist in governance/README.md.

Rollback

Close this PR. All operations are reversible (archive↔unarchive, transfer back); delete targets only empty repos behind a confirmation guard.
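
The dry-run default and the confirmation guard mentioned in this description could be sketched roughly as follows. `APPLY` and `CONFIRM_DELETE` are the variable names that appear in this PR; `run` and `guard_delete` are hypothetical helper names, and the actual scripts may be organized differently.

```shell
#!/usr/bin/env bash
# Sketch of a dry-run-by-default wrapper plus a delete confirmation guard.
# run and guard_delete are illustrative names, not taken from the governance scripts.

# Dry-run unless the caller explicitly opts in with APPLY=true.
APPLY="${APPLY:-false}"

# Execute the given command only when APPLY=true; otherwise just print it.
run() {
  if [ "$APPLY" = "true" ]; then
    "$@"
  else
    printf '[dry-run] %s\n' "$*"
  fi
}

# Refuse to proceed unless deletion was confirmed and targets were supplied.
# Usage: guard_delete REPO...
guard_delete() {
  [ "${CONFIRM_DELETE:-}" = "yes" ] \
    || { echo "refusing: set CONFIRM_DELETE=yes to delete" >&2; return 1; }
  [ "$#" -gt 0 ] \
    || { echo "refusing: target list is empty" >&2; return 1; }
}
```

Keeping both checks means a deletion needs three deliberate steps: naming the targets, setting CONFIRM_DELETE=yes, and switching APPLY to true.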
https://claude.ai/code/session_01XmPtu6BAhYCMX2SZXfbe76
Generated by Claude Code