feat: git sparse checkout mode for batch scanning (rebase of #1) #13
Merged
Adds --clone flag to batch and discover commands, scanning repos via local git sparse checkout instead of the GitHub API. This avoids API rate limits when scanning large numbers of repos.

Key changes:
- internal/git: sparse clone package with concurrent clone-and-scan
- Sliding star-count windows in FetchTopRepos to paginate beyond GitHub's 1,000-result search limit
- Repo list caching in SQLite for --resume with --top N
- SQLite hardening: single-conn serialization + busy_timeout for concurrent goroutine writes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
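For readers unfamiliar with the technique, a sparse, shallow clone can be driven from Go roughly as follows. This is a minimal sketch, not the code in `internal/git`: the function names `sparseCloneArgs` and `sparseClone`, the example URL, and the choice of `.github/workflows` as the checked-out path are all assumptions made for illustration. The git flags themselves (`--depth`, `--filter=blob:none`, `--sparse`, and `git sparse-checkout set`) are standard git options.

```go
package main

import (
	"fmt"
	"os/exec"
)

// sparseCloneArgs returns the git invocations for a shallow, blobless,
// sparse clone of repoURL into dir, limited to the given directories.
// Hypothetical helper; the real package may structure this differently.
func sparseCloneArgs(repoURL, dir string, paths []string) [][]string {
	// --sparse starts with only top-level files checked out;
	// --filter=blob:none defers blob downloads until they are needed.
	clone := []string{"clone", "--depth", "1", "--filter=blob:none", "--sparse", repoURL, dir}
	// Widen the sparse checkout to just the directories we want to scan.
	sparse := append([]string{"-C", dir, "sparse-checkout", "set"}, paths...)
	return [][]string{clone, sparse}
}

// sparseClone runs the commands in order and stops at the first failure.
func sparseClone(repoURL, dir string, paths []string) error {
	for _, args := range sparseCloneArgs(repoURL, dir, paths) {
		if out, err := exec.Command("git", args...).CombinedOutput(); err != nil {
			return fmt.Errorf("git %v: %w: %s", args, err, out)
		}
	}
	return nil
}

func main() {
	// Print the commands instead of running them, so the sketch has no side effects.
	// The URL and path below are placeholders.
	for _, args := range sparseCloneArgs("https://github.com/example/repo.git", "/tmp/repo", []string{".github/workflows"}) {
		fmt.Println("git", args)
	}
}
```

Because each clone touches only git transport and the local disk, none of this counts against the GitHub REST API rate limit, which is the motivation stated above.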
Rebase of @SecKatie's #1 onto current `main`. The original PR was closed because `main` had moved substantially (v0.7.0 split `internal/scanner/` → `pkg/scanner/`, platform additions, migration numbering collision). Authorship on the commit is preserved as Katie Mulliken; credit is hers.

## Summary

- Adds `--clone` flag to `batch` and `discover` commands to scan repos via local git sparse checkout instead of the GitHub API, avoiding rate limits at scale
- Sliding star-count windows in `FetchTopRepos` to paginate beyond GitHub's 1,000-result search limit
- Repo list caching in SQLite so `--resume` with `--top N` skips re-fetching
- SQLite hardening (single-conn serialization + `busy_timeout`)

## Rebase resolution notes

- `cmd/fluxgate/main.go`: kept `--clone` / `--concurrency` / `--keep` alongside the newer `--tokens` PAT-rotation flag; both work together
- `internal/github/batch.go`: sliding-window `query` param + `withRetryRotate` so search picks up token rotation when present
- `internal/store/migrations.go`: original `migration002AddRepoLists` collided with `migration002Disclosures` that landed on main; renumbered to `migration005AddRepoLists`
- `RepoList*` API came through clean
- `internal/git` package included as-is

## Test plan

- `CGO_ENABLED=0 go build ./...` clean
- `go test ./...` passes
- `batch --top N --clone --resume` on the research station before merge

Credit: @SecKatie, thanks for the patience on the rebase, this is a solid contribution. Closes the work originally proposed in #1.