feat: Add GitHub Actions Runner Scale Set integration #253

Open: whywaita wants to merge 6 commits into master from feat-scale-set-client

whywaita commented Feb 11, 2026

Summary

  • Implement scale set mode (ref) as an alternative to webhook mode
  • Add support for long-polling the GitHub Scale Set API
  • Enable faster runner provisioning with JIT (Just-In-Time) configuration

Changes

  • New package pkg/scaleset: Manager, scaler, JIT script generation, and metrics
  • Config extension: Add SCALESET_* environment variables for scale set configuration
  • Server integration: Conditional startup based on SCALESET_ENABLED flag
  • Documentation: Comprehensive guide for scale set mode in docs/scaleset-mode.md
  • Tests: Full test coverage for scaleset package (19 test cases)

Key Components

  1. Scale Set Manager (pkg/scaleset/manager.go): Manages scale set lifecycle for all targets
  2. Scaler (pkg/scaleset/scaler.go): Implements listener.Scaler interface for runner provisioning
  3. JIT Scripts (pkg/scaleset/scripts.go): Generates simplified setup scripts for JIT runners
  4. Metrics (pkg/scaleset/metrics.go): 5 Prometheus metrics for monitoring scale set operations

Architecture Changes

Webhook Mode (existing):

GitHub webhook → myshoes → job queue → starter loop → shoes plugin → runner

Scale Set Mode (new):

myshoes scale set manager → long-poll Scale Set API → JIT config → shoes plugin → runner

Configuration

Enable scale set mode with environment variables:

SCALESET_ENABLED=true              # Enable scale set mode (default: false)
SCALESET_RUNNER_GROUP=default      # Runner group name (default: "default")
SCALESET_MAX_RUNNERS=10            # Max runners per scale set (default: 10)
SCALESET_NAME_PREFIX=myshoes       # Scale set name prefix (default: "myshoes")

GitHub App Permissions

New requirement for Organization-level targets:

  • organization_self_hosted_runners: Read & Write

Repository-level targets continue to work with existing permissions.

Test Plan

  • Unit tests for scaleset package: go test ./pkg/scaleset/...
  • All existing tests pass: go test ./...
  • Build succeeds: go build ./...
  • Code formatted: go fmt ./...

Manual Testing

To test scale set mode:

  1. Set SCALESET_ENABLED=true and other SCALESET_* variables
  2. Ensure GitHub App has organization_self_hosted_runners permission (for org-level targets)
  3. Start myshoes and verify scale set creation in logs
  4. Trigger a workflow job and verify runner provisioning
  5. Check /metrics endpoint for myshoes_scaleset_* metrics

Notes

  • Backward compatible: Webhook mode remains default (no breaking changes)
  • Shoes plugin compatible: No proto changes required; existing plugins work as-is
  • JIT benefits: Eliminates registration token and config.sh steps for faster startup
  • GHES support: Works with GitHub Enterprise Server via GITHUB_URL configuration
  • Mode exclusivity: Webhook and scale set modes are mutually exclusive (global switch)

For detailed documentation, see docs/scaleset-mode.md.

Add support for scale set mode as an alternative to webhook mode.
Scale set mode long-polls the GitHub Scale Set API instead of
receiving webhook events, enabling faster runner provisioning with
JIT (Just-In-Time) configuration.

Key changes:
- Add scaleset package with manager, scaler, and JIT script generation
- Add SCALESET_* environment variables for configuration
- Integrate scale set manager into server startup
- Add comprehensive tests and documentation
- Support both webhook and scale set modes via SCALESET_ENABLED flag

prompt: Implement the following plan:

# Scale Set Client Integration Plan
...
----
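Understood. Starting implementation of the Scale Set Client integration plan.

I will proceed in stages: start with the underlying Config, then build the scaleset package, followed by server integration, tests, and documentation.

## Step 1: Add dependencies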
...
----
(implementation details)

Fix four critical issues identified in code review:

P1 Issues:
1. Remove stopped listeners from scaler map
   - Add defer cleanup in listener goroutine to remove from m.scalers
   - Prevents stale entries that block target restart after errors

2. Disable webhook enqueue in scale set mode
   - Add scaleSetEnabled parameter to NewMux
   - Skip /github/events registration when scale set mode is enabled
   - Prevents job queue accumulation from unused webhooks

3. Preserve runner state on instance deletion failure
   - Return error when DeleteInstance fails
   - Keep runner in activeRunners for retry on transient errors
   - Prevents orphaned instances from untracked failures

P2 Issues:
4. Refresh scaler when target config changes
   - Add targetConfigChanged() to detect config updates
   - Restart listener when ResourceType/ProviderURL/Status changes
   - Ensures API-driven target updates take effect immediately
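
A change check like the one described in item 4 might look as follows. This is a sketch only: the `Target` struct is a trimmed-down, hypothetical stand-in with just the three fields the commit message names, and the signature of `targetConfigChanged` is assumed.

```go
package main

import "fmt"

// Target is a hypothetical, reduced view of a myshoes target holding only
// the fields the commit message lists as restart triggers.
type Target struct {
	ResourceType string
	ProviderURL  string
	Status       string
}

// targetConfigChanged reports whether any field that affects the listener
// differs between the cached target and the freshly loaded one.
func targetConfigChanged(cached, current Target) bool {
	return cached.ResourceType != current.ResourceType ||
		cached.ProviderURL != current.ProviderURL ||
		cached.Status != current.Status
}

func main() {
	cached := Target{ResourceType: "nano", ProviderURL: "./shoes-a", Status: "active"}
	current := cached
	fmt.Println("unchanged:", targetConfigChanged(cached, current))
	current.ResourceType = "large"
	fmt.Println("resource type changed:", targetConfigChanged(cached, current))
}
```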

All tests pass and the build succeeds.

…b completion

Guard deferred cleanup in startListener from deleting a replacement
wrapper stored by syncTargets during config-change restarts. The defer
now checks pointer identity before removing the map entry.
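
The pointer-identity guard can be sketched as below. `Manager`, `scalerWrapper`, `runListener`, and `removeIfCurrent` are hypothetical simplifications; in the PR the listener runs in a goroutine started by startListener, while here it runs synchronously to keep the demo deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// scalerWrapper is a hypothetical stand-in for the per-target wrapper.
type scalerWrapper struct{ name string }

// Manager holds one wrapper per target key.
type Manager struct {
	mu      sync.Mutex
	scalers map[string]*scalerWrapper
}

// removeIfCurrent deletes the entry only if the map still points at w, so a
// replacement stored in the meantime is left alone.
func (m *Manager) removeIfCurrent(key string, w *scalerWrapper) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cur, ok := m.scalers[key]; ok && cur == w {
		delete(m.scalers, key)
	}
}

// runListener runs the listener body and cleans up its own map entry when
// the body returns.
func (m *Manager) runListener(key string, w *scalerWrapper, listen func()) {
	defer m.removeIfCurrent(key, w)
	listen()
}

func main() {
	m := &Manager{scalers: map[string]*scalerWrapper{}}
	old := &scalerWrapper{name: "old"}
	repl := &scalerWrapper{name: "replacement"}
	m.scalers["target-1"] = old

	m.runListener("target-1", old, func() {
		// Simulate syncTargets swapping in a replacement during a restart.
		m.mu.Lock()
		m.scalers["target-1"] = repl
		m.mu.Unlock()
	})
	fmt.Println("after old listener exits:", m.scalers["target-1"].name)

	m.runListener("target-1", repl, func() {})
	_, ok := m.scalers["target-1"]
	fmt.Println("entry present after replacement exits:", ok)
}
```

Comparing keys alone would not be enough here, because the old and new wrappers share the same key.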

Fall back to datastore lookup in HandleJobCompleted when activeRunners
(in-memory) does not contain the runner, preventing instance and
datastore row leaks after process restarts or listener re-creation.

The GitHub API returns a 400 ArgumentNullException when creating scale
sets because Label.Type was set to invalid enum values ("scaleset",
"scope"). Leave Type empty so the library defaults to "System", matching
the official actions/scaleset examples.

Also adds RunnerSetting{DisableUpdate: true} and removes the redundant
RunnerGroupName field.