
Governance & consolidation package + repos.yaml reconciliation#26

Merged
muntianus merged 7 commits into main from claude/github-repo-consolidation-JhUyc
May 29, 2026

@muntianus
Owner

What

Adds a complete repository-governance package under governance/ plus an
org-wide SECURITY.md and a one-tap Actions workflow, and reconciles
repos.yaml with the live inventory.

Grounded in a read-only scan of all 46 repos (22 active, 24 archived).
Nothing was archived/deleted/transferred — those are delivered as reviewable,
dry-run-by-default scripts.

Inventory (governance/inventory/)

  • repos.csv — 46 repos + classification (PROD 8 · ACTIVE 12 · LAB 2 · ARCHIVE 24)
  • security.csv — visibility, baseline files, collaborators
  • actions.csv — workflows + dead-workflow signals

Reports (governance/reports/)

  • security-findings.md, security-hardening.md, duplications.md,
    archive-candidates.md, manifest-drift.md

Operations (governance/operations/)

  • inventory.sh, merge.sh (git-subtree collapse, history-preserved),
    harden.sh, archive.sh, migrate.sh, delete.sh — all dry-run by default,
    shellcheck-clean. delete.sh is guarded by CONFIRM_DELETE=yes and an empty
    target list.
  • .github/workflows/governance-ops.yml: runs any operation from the Actions
    tab (no local shell; works from an iPad). Requires a one-time GOV_PAT secret.
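
A minimal sketch of the dry-run-by-default convention described above. The APPLY variable appears in the reviewed merge.sh hunk; the run() wrapper is a hypothetical helper introduced only for illustration and may not match how the real scripts are structured.

```shell
#!/usr/bin/env bash
# Dry-run unless the caller explicitly opts in with APPLY=true.
APPLY="${APPLY:-false}"

# run() is a hypothetical wrapper: it executes the command only in apply mode
# and otherwise prints what would have been executed.
run() {
  if [ "$APPLY" = "true" ]; then
    "$@"
  else
    printf 'DRY-RUN: %s\n' "$*"
  fi
}

# In the default mode this only prints the command line.
run echo "archiving example-repo"
```

Routing every mutating command through one wrapper keeps the default harmless: a forgotten flag produces a printed plan rather than a change.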

Decisions baked in

  • Maximum collapse → single perfops monorepo; absorbed standalones get
    archived. perfops-consulting-site (public), architecture, .github,
    vault, and personal repos stay separate. See governance/README.md.

Key findings

  1. No server-side secret scanning (no GHAS) — mitigated by gitleaks + security-gate.
  2. 2 public repos; perfops-consulting-site has 2 external write collaborators + 131 branches.
  3. Two umbrella monorepos duplicate standalone repos (the collapse target).
  4. repos.yaml drift fixed in this PR.

Limitations (honest)

Archiving/transferring/deleting repos, creating orgs, editing settings, and
adding secrets cannot be done via the available API tooling. They run either
as one tap in the Actions workflow (after adding GOV_PAT) or as clicks in the
GitHub UI — see the checklist in governance/README.md.

Rollback

Close this PR. All operations are reversible (archive↔unarchive, transfer back);
delete targets only empty repos behind a confirmation guard.

https://claude.ai/code/session_01XmPtu6BAhYCMX2SZXfbe76


Generated by Claude Code

muntianus added 5 commits May 30, 2026 01:48
- platform -> perfops-platform (live monorepo); reportkit/perfops-sdk noted as subdirs
- add `agents` product line (perfops-agents, mcp-perfops, agent-config)
- add vault (documentation) and perfops-testing (launch_gate umbrella)
- move now-archived repos into archived: section
- drop absent demo repos
See governance/reports/manifest-drift.md for rationale.

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a comprehensive repository governance and consolidation package under the governance/ directory, including inventory CSVs, security and duplication reports, and executable scripts to automate the transition to a single monorepo topology. It also reconciles the repos.yaml manifest and adds an organization-wide SECURITY.md. Feedback on the changes highlights a critical bug in merge.sh where a failed clone can lead to executing commands in the wrong directory, a dangerous 64KB size threshold in delete.sh that risks deleting small active repos, a schema mismatch in inventory.sh that overwrites manually curated columns, and an opportunity to optimize default branch resolution in merge.sh using the GitHub CLI.

Comment thread governance/operations/merge.sh Outdated
Comment on lines +44 to +49
if [ "$APPLY" = "true" ]; then
  rm -rf "$WORKDIR"; mkdir -p "$WORKDIR"
  log "Cloning seed $OWNER/$SEED into $WORKDIR/perfops ..."
  git clone "$GH_BASE/$SEED.git" "$WORKDIR/perfops"
fi
cd "$WORKDIR/perfops" 2>/dev/null || { warn "(dry-run: seed not cloned)"; }

critical

Critical Bug: Script executes destructive Git commands in the wrong directory if clone fails

If APPLY is true but the git clone command fails (due to network issues, invalid permissions, or missing repository), the directory $WORKDIR/perfops will not exist.

Because of the || operator on line 49:
cd "$WORKDIR/perfops" 2>/dev/null || { warn "(dry-run: seed not cloned)"; }
the failure of the cd command is caught, bypassing set -e. The script will print the warning and continue executing the rest of the script (including git remote add, git fetch, and git subtree add) in the current working directory (which is likely the user's local clone of the .github or parent repository). This will corrupt the user's local repository with unwanted remotes and subtree merges.

We must ensure the script exits immediately if cd fails during an actual apply run.

Suggested change
- if [ "$APPLY" = "true" ]; then
-   rm -rf "$WORKDIR"; mkdir -p "$WORKDIR"
-   log "Cloning seed $OWNER/$SEED into $WORKDIR/perfops ..."
-   git clone "$GH_BASE/$SEED.git" "$WORKDIR/perfops"
- fi
- cd "$WORKDIR/perfops" 2>/dev/null || { warn "(dry-run: seed not cloned)"; }
+ if [ "$APPLY" = "true" ]; then
+   rm -rf "$WORKDIR"; mkdir -p "$WORKDIR"
+   log "Cloning seed $OWNER/$SEED into $WORKDIR/perfops ..."
+   git clone "$GH_BASE/$SEED.git" "$WORKDIR/perfops"
+   cd "$WORKDIR/perfops"
+ else
+   cd "$WORKDIR/perfops" 2>/dev/null || log "(dry-run: seed not cloned)"
+ fi
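
The difference between the two patterns can be reproduced in isolation. The sketch below is not taken from merge.sh: the path is a hypothetical stand-in for "$WORKDIR/perfops" after a failed clone, and run_variant is an illustrative helper. Each variant runs in a child bash process so that its errexit behaviour is not affected by the calling shell.

```shell
#!/usr/bin/env bash
# Hypothetical path that does not exist, standing in for a failed clone target.
missing="/nonexistent/perfops-seed"

# Pattern from the original hunk: `||` handles the failed cd, so `set -e` never fires.
unguarded='set -e; cd "$1" 2>/dev/null || echo "warn: seed not cloned"; echo "continued"'

# Pattern from the suggestion: a bare cd lets `set -e` stop the script.
guarded='set -e; cd "$1" 2>/dev/null; echo "continued"'

# Run a variant in a child shell, passing the missing path as $1.
run_variant() { bash -c "$1" _ "$missing"; }

echo "unguarded: $(run_variant "$unguarded" | tail -n 1)"
if run_variant "$guarded" >/dev/null; then
  echo "guarded: continued"
else
  echo "guarded: aborted"
fi
```

The unguarded variant keeps going in whatever directory it started in, which is the failure mode the review describes; the guarded variant exits before any further command runs.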

Comment thread governance/operations/delete.sh Outdated
Comment on lines +48 to +49
[ "$archived" = "true" ] || { warn "$repo: not archived — archive first, don't delete"; return 1; }
[ "$size" -le 64 ] || { warn "$repo: size=${size}KB (not empty) — archive instead"; return 1; }

high

Dangerous Size Threshold for Repository Deletion

The script allows deleting any repository with a size up to 64KB.

While 64KB seems small, many highly valuable, fully functional repositories containing scripts, configurations, or small microservices are well under this limit. For example:

  • order-lab (Go/Kafka microservice) is only 6KB.
  • nfr-library (SLO catalog) is only 12KB.
  • chaos-scenarios (Chaos engineering) is only 13KB.

If any of these repositories are archived (which is planned as part of this consolidation) and accidentally added to TO_DELETE, they will pass this safety check and be permanently and irreversibly deleted.

To prevent accidental data loss, the size threshold should be set to 0 (or a very low number like 1 or 2 KB) to ensure only truly empty repositories can be deleted.

Suggested change
- [ "$archived" = "true" ] || { warn "$repo: not archived — archive first, don't delete"; return 1; }
- [ "$size" -le 64 ] || { warn "$repo: size=${size}KB (not empty) — archive instead"; return 1; }
+ [ "$archived" = "true" ] || { warn "$repo: not archived — archive first, don't delete"; return 1; }
+ [ "$size" -eq 0 ] || { warn "$repo: size=${size}KB (not empty) — archive instead"; return 1; }
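
As a rough illustration of how the tightened guard behaves, the sketch below wraps the two checks in a hypothetical can_delete function that takes the repository name, archived flag, and size as arguments. How the real delete.sh obtains those values is not shown in this thread, and "empty-scratch" is an invented repository name; the 6 KB and 12 KB sizes are the ones quoted in the review.

```shell
#!/usr/bin/env bash
# can_delete is a hypothetical helper: name, archived flag, size in KB.
can_delete() {
  local repo="$1" archived="$2" size_kb="$3"
  [ "$archived" = "true" ] || { echo "$repo: not archived" >&2; return 1; }
  [ "$size_kb" -eq 0 ] || { echo "$repo: size=${size_kb}KB (not empty)" >&2; return 1; }
  return 0
}

# Sample rows: two small repos named in the review and one invented empty repo.
while read -r repo archived size_kb; do
  if can_delete "$repo" "$archived" "$size_kb" 2>/dev/null; then
    echo "$repo: deletable"
  else
    echo "$repo: refused"
  fi
done <<'EOF'
order-lab true 6
empty-scratch true 0
nfr-library false 12
EOF
```

With the previous 64 KB threshold, the archived 6 KB row would have passed; with the zero-size check only the empty repository does.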

Comment thread governance/operations/inventory.sh
Comment on lines +25 to +34
# repos.csv
{
  echo "name,visibility,archived,fork,created_at,pushed_at,default_branch,stars,watchers,forks,open_issues,language,size_kb,license,description"
  echo "$repos_json" | jq -r '[.name,
    (if .private then "private" else "public" end),
    .archived, .fork, (.created_at|.[0:10]), (.pushed_at|.[0:10]),
    .default_branch, .stars, .watchers, .forks, .open_issues,
    (.language // ""), .size, .license,
    ((.description // "") | gsub("[\n,]";" "))] | @csv'
} > "$OUT_DIR/repos.csv"

high

Schema Mismatch and Data Loss in Inventory Regeneration

The inventory.sh script is completely out of sync with the actual CSV files checked into the repository (repos.csv, actions.csv, and security.csv).

  1. Schema Mismatches:

    • repos.csv in the repo has 20 columns (including classification, topics, has_readme, branch_count, workflow_count), but this script only generates 15 columns.
    • actions.csv in the repo has 6 columns (including actions_enabled, dead_workflow_signals), but this script only generates 4 columns.
    • security.csv in the repo has 11 columns (including has_license, has_security_md, has_codeowners, secret_scanning, gitleaks_in_repo, risk_notes), but this script only generates 5 columns.
  2. Data Destruction:

    • Using the redirection operator > directly truncates the target files. This will completely wipe out all manually curated columns (such as classification in repos.csv and risk_notes in security.csv), despite the log message claiming that the classification column is not overwritten.

To fix this, the script should either merge the live API data with the existing CSV files to preserve manually curated columns and extra fields, or at least generate the full set of columns and avoid overwriting curated fields.
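
One way the first option (merging live data with the curated file) could look is sketched below. It is an assumption-heavy illustration, not the fix that was merged: the file names, the two-column raw schema, the repository names, and the merge_curated helper are all hypothetical, and the awk logic assumes simple CSV with no quoted commas.

```shell
#!/usr/bin/env bash
# merge_curated is a hypothetical helper: it emits the raw rows with the curated
# file's last column (assumed to be "classification") appended, matched by name.
merge_curated() {
  local raw="$1" curated="$2"
  awk -F, -v OFS=, '
    NR == FNR { if (FNR > 1) class[$1] = $NF; next }  # first file: remember curated value per repo
    FNR == 1  { print $0, "classification"; next }    # second file: extend the header
              { print $0, (($1 in class) ? class[$1] : "") }
  ' "$curated" "$raw"
}

# Invented sample data, written to a scratch directory.
tmp="$(mktemp -d)"
printf 'name,size_kb,classification\nrepo-a,5,LAB\n' > "$tmp/curated.csv"
printf 'name,size_kb\nrepo-a,6\nrepo-b,0\n' > "$tmp/raw.csv"

# Write to a separate file first so the curated CSV is never truncated mid-run.
merge_curated "$tmp/raw.csv" "$tmp/curated.csv" > "$tmp/merged.csv"
cat "$tmp/merged.csv"
```

Repositories that are new in the live snapshot get an empty classification, which leaves them visibly unreviewed instead of silently dropping or overwriting curated values.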

Comment thread governance/operations/merge.sh Outdated
Comment on lines +81 to +82
gh pr create --repo "$OWNER/$SEED" --base "$(git remote show origin \
| sed -n 's/.*HEAD branch: //p')" --head "$branch" --draft \

medium

Inefficient and Fragile Default Branch Resolution

Using git remote show origin to find the default branch performs a network request to the remote and then scrapes human-readable output with sed, which is fragile.

Since the GitHub CLI (gh) is already required and authenticated, we can query the default branch more reliably with gh repo view, which returns the value as structured JSON.

Suggested change
- gh pr create --repo "$OWNER/$SEED" --base "$(git remote show origin \
-   | sed -n 's/.*HEAD branch: //p')" --head "$branch" --draft \
+ gh pr create --repo "$OWNER/$SEED" --base "$(gh repo view "$OWNER/$SEED" --json defaultBranchRef --jq .defaultBranchRef.name)" --head "$branch" --draft \

muntianus added 2 commits May 30, 2026 02:00
…up, delete.sh size=0, inventory.sh writes to raw/

- merge.sh: cd into clone inside the apply branch so set -e aborts on a failed
  clone instead of running git in the wrong cwd (gemini critical)
- merge.sh: resolve PR base via `gh repo view` instead of `git remote show`
- delete.sh: only size==0 repos are deletable (was <=64KB; tiny live repos exist)
- inventory.sh: write raw API snapshot to inventory/raw/, never clobber the
  curated CSVs (classification/risk_notes/deep-scan columns)
@muntianus muntianus marked this pull request as ready for review May 29, 2026 23:20
@muntianus muntianus merged commit f0373ab into main May 29, 2026
1 check passed