campaign(2k-stars): community health + broader positioning + docs/ + GitHub Action by hoainho · Pull Request #30 · nano-step/eval-harness

hoainho · 2026-06-01T12:51:13Z

What does this PR do?

End-to-end execution of Phase 1 (foundation) + Phase 2 (awesome-list PR wave) + Phase 3 partial (launch content + GitHub Action + KPI script) of the 2000-star contributor-attraction campaign.

Pure additions + a README pitch rewrite — no source-code changes, no test impact.

Why?

The repo has world-class internals but very few discovery surfaces. This PR adds the surfaces (badges, docs, action, issue templates, KPI tracking) and broadens the README pitch from "opencode-skill testing" → "behavior-regression testing for LLM agents (opencode runner today, more coming)".

What's in this PR (2 commits, 35 files, ~3000 lines)

Commit 1 — community health + docs + GitHub Action

| Surface | Before | After |
| --- | --- | --- |
| GitHub topics | none | 14 (`llm-evaluation`, `ai-agents`, `regression-testing`, `opencode`, `llmops`, `claude`, `anthropic`, ...) |
| GitHub description | "Behavior-regression eval harness for opencode skills..." | "Behavior-regression testing for LLM agents. 4-class attribution, 6-field FAIL schema..." |
| Discussions | disabled | enabled, 3 seed threads (#27, #28, #29) |
| README badges | none | npm / license / tests / stars / discussions / issues / good-first-issues |
| Hero demo | none | `docs/assets/demo.tape` (vhs script) + README GIF placeholder |
| Comparison docs | none | concepts.md, comparison.md, runners.md, why-not-promptfoo.md, docs/README.md |
| Community health | CONTRIBUTING.md only | + CODE_OF_CONDUCT, SECURITY, FUNDING, 3 issue templates, PR template |
| CI integration | none shipped | `.github/actions/eval-harness` composite action + example workflow |

Commit 2 — launch content + KPI + awesome-PR plans + handoff

| Surface | Content |
| --- | --- |
| `.campaign/posts/` | HN Show post, Reddit (LocalLLaMA + ClaudeAI + mlops), 3 blog posts, X thread, all with response playbooks |
| `.campaign/awesome-pr-bodies/` | Step-by-step submission guides for 3 active lists + revisit-later for the rest. Pre-flight check found 4 dead lists I'm explicitly skipping. |
| `scripts/eval/tools/stars-kpi.sh` | Weekly KPI snapshot: stars, forks, watchers, contributors, unique authors (30d), traffic (views/clones 14d), top referrers, awesome-list star-floor tracker. Tested live: works. |
| `.campaign/CAMPAIGN.md` | 12-month handoff doc: 14-day critical path, weekly cadence, milestones at 100/500/1000/2000 stars, anti-patterns, what the agent can do in follow-up sessions vs what only humans can do. |

Awesome-list PRs opened today (outside this PR, under `nano-step` org)

taishi-i/awesome-ChatGPT-repositories #150 — active list, no floor
tensorchord/Awesome-LLMOps #538 — active, 5.8k stars
steven2358/awesome-generative-ai #830 — active, 12k stars

How is it tested?

docs-only + config-only + action-only + tooling-only.

The existing 20 test suites under `scripts/eval/tests/` are untouched and still green (verify locally with `for t in scripts/eval/tests/*.sh; do bash "$t"; done`).
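The one-liner above reports only the exit status of the last suite it runs. A minimal sketch of a wrapper that fails if any suite fails is shown here; `run_suites` is a hypothetical helper name and is not part of the repository, and the sketch assumes each suite signals failure through a non-zero exit status.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper: run every *.sh suite in a directory and return
# non-zero if at least one suite exited non-zero.
run_suites() {
  local dir="$1" failed=0 t
  for t in "$dir"/*.sh; do
    [[ -e "$t" ]] || continue          # the glob matched nothing
    if bash "$t"; then
      echo "PASS $t"
    else
      echo "FAIL $t"
      failed=$((failed + 1))
    fi
  done
  return "$(( failed > 0 ))"           # 0 when all suites passed, else 1
}
```

It could be invoked as `run_suites scripts/eval/tests`, with the return value used directly as the exit status of a local pre-push check.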

The GitHub Action itself is a thin composite over the existing CLI; end-to-end smoke happens when consumed by a downstream repo.

The KPI script was test-run against the live repo and produced expected output (4 stars, 17 views/14d baseline).

Before / after evidence

See the table above. The repo went from "0 surfaces for discovery" → "every surface a stranger needs to evaluate, contribute to, or integrate with eval-harness."

Issues created alongside this PR

- 5 good first issues:
  - #31: Add an expect_exact_lines variant to kind: shell (expect_exact_lines)
  - #32: Add eval-harness doctor — preflight diagnostics command (doctor command, pinned)
  - #33: Document the case YAML with a JSON Schema (for editor autocomplete) (JSON Schema)
  - #34: Pretty-print eval-harness status output as a colored table (status TTY table)
  - #35: Add --since filter to eval-harness trend (e.g. --since=7d, --since=2026-05-01) (--since for trend)
- LangGraph runner issue: #36, Add a langgraph-node runner for regression-testing LangGraph agents (pinned, help wanted, runner label)
- Existing issues labeled with good first issue / help wanted / security / portability / ci-integration / documentation

Recommended next step

Merge this PR. Then follow the 14-day critical path in .campaign/CAMPAIGN.md. The campaign is set up; the next 12 months are execution.

Checklist

No test suites touched — N/A
CHANGELOG.md will be updated as part of v0.4.3 (this is meta-campaign work, not a release)
Read CONTRIBUTING.md
No score.sh or attribute.sh changes — GNU/BSD grep check N/A

…Action

- README: badges row, broader-positioning hero ('regression testing for LLM agents'), hero GIF placeholder + docs/* learn-more links.
- .github/: CODE_OF_CONDUCT.md, SECURITY.md, FUNDING.yml, issue templates (bug / feature / case-recipe), PR template, ISSUE_TEMPLATE/config.yml.
- docs/: concepts.md (4 core ideas), comparison.md (vs promptfoo / DeepEval / Ragas / OpenAI Evals), runners.md (runner abstraction + langgraph/claude-agent-sdk roadmap), why-not-promptfoo.md (direct head-to-head), docs/README.md index, docs/assets/demo.tape (vhs script).
- .github/actions/eval-harness/: composite GitHub Action (action.yml + README.md): installs jq/yq/opencode/eval-harness, runs against changed skills, posts job summary with 6-field FAIL, uploads runs/ artifact, exit 12 on regression. Action README with quickstart + inputs/outputs + examples + pinning + marketplace publishing guide.
- .github/workflows/eval-example.yml: example PR/push integration.

No source-code changes. No test impact. Part of the campaign/2k-stars roadmap. See [internal doc].

gemini-code-assist

Code Review

This pull request introduces comprehensive repository configuration, templates, and documentation, including issue templates, a pull request template, a Code of Conduct, a Security Policy, and detailed conceptual and comparison guides. It also adds a composite GitHub Action (.github/actions/eval-harness/action.yml) to run behavior-regression testing in CI. The review feedback highlights critical improvements for the GitHub Action: preventing pipeline failures when no skills are changed by replacing grep with awk filtering, handling empty BASE_SHA values on new branch pushes to avoid git diff crashes, and explicitly setting EVAL_STATE_DIR to ensure evaluation runs are written to the local workspace for step summaries and artifact uploads.

gemini-code-assist · 2026-06-01T12:53:00Z

+        CHANGED_SKILLS=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA" \
+          | grep -E '^.opencode/skills/[^/]+/' \
+          | awk -F/ '{print $3}' \
+          | sort -u \
+          | paste -sd "," -)


Because set -euo pipefail is enabled, if grep finds no matching lines (which happens whenever a commit/PR does not touch any files in .opencode/skills/), it will exit with status 1, causing the entire pipeline and the workflow step to fail. We can avoid this and simplify the pipeline by performing the path filtering directly inside awk.

        CHANGED_SKILLS=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA" \
          | awk -F/ '$1 == ".opencode" && $2 == "skills" && $3 != "" {print $3}' \
          | sort -u \
          | paste -sd "," -)
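To make the failure mode concrete, here is a minimal sketch that can be run outside CI. It assumes bash plus standard `awk`, `sort`, and `paste`; `changed_skills` is a hypothetical helper name, not something defined in the action. The extra `NF >= 4` guard is an addition of this sketch: it mirrors the trailing slash in the original grep pattern, so a file sitting directly in `.opencode/skills/` is not reported as a skill.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: read changed paths on stdin and print a
# comma-separated, de-duplicated list of skill directory names.
changed_skills() {
  awk -F/ '$1 == ".opencode" && $2 == "skills" && $3 != "" && NF >= 4 {print $3}' \
    | sort -u \
    | paste -sd "," -
}

# grep exits 1 when nothing matches; with pipefail that becomes the
# pipeline's status, which would abort the step if it were not guarded.
if ! printf 'README.md\n' | grep -E '^\.opencode/skills/[^/]+/' >/dev/null; then
  echo "grep: no match, pipeline status is non-zero"
fi

# awk exits 0 whether or not a line matches, so the same input is harmless.
skills=$(printf 'README.md\n' | changed_skills)
echo "awk: skills='${skills}'"
```

In the action, the input would come from `git diff --name-only` instead of `printf`; an empty result can then be handled explicitly, for example by skipping the eval run.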

gemini-code-assist · 2026-06-01T12:53:00Z

+        if [[ "${{ github.event_name }}" == "pull_request" ]]; then
+          BASE_SHA="${{ github.event.pull_request.base.sha }}"
+          HEAD_SHA="${{ github.event.pull_request.head.sha }}"
+        else
+          BASE_SHA="${{ github.event.before }}"
+          HEAD_SHA="${{ github.sha }}"
+        fi


In a push event, if a new branch is pushed, github.event.before is set to 0000000000000000000000000000000000000000. Passing this to git diff will cause a fatal error (fatal: bad object ...) and crash the workflow. We should handle this by falling back to HEAD~1 if it exists, or to HEAD_SHA itself (resulting in an empty diff).

        if [[ "${{ github.event_name }}" == "pull_request" ]]; then
          BASE_SHA="${{ github.event.pull_request.base.sha }}"
          HEAD_SHA="${{ github.event.pull_request.head.sha }}"
        else
          BASE_SHA="${{ github.event.before }}"
          HEAD_SHA="${{ github.sha }}"
          if [[ "$BASE_SHA" == "0000000000000000000000000000000000000000" || -z "$BASE_SHA" ]]; then
            if git rev-parse --verify HEAD~1 >/dev/null 2>&1; then
              BASE_SHA="HEAD~1"
            else
              BASE_SHA="$HEAD_SHA"
            fi
          fi
        fi
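The fallback logic can also be pulled into a small function so it can be exercised without a workflow run. This is a sketch under stated assumptions: `resolve_base_sha` and `ZERO_SHA` are hypothetical names, and the function probes `<head>~1` rather than `HEAD~1`, which should be equivalent when the checkout sits at the head commit but is a deliberate variation on the suggestion above.

```shell
#!/usr/bin/env bash
set -euo pipefail

ZERO_SHA="0000000000000000000000000000000000000000"

# Hypothetical helper: print the revision to diff against.
#   $1 = candidate base (e.g. github.event.before); may be empty or all zeros
#   $2 = head SHA
resolve_base_sha() {
  local base="${1:-}" head="$2"
  if [[ -z "$base" || "$base" == "$ZERO_SHA" ]]; then
    # Prefer the parent of head when it is reachable (it is not in a
    # depth-1 clone or for a root commit); otherwise diff head against
    # itself, which yields an empty diff.
    if git rev-parse --verify --quiet "${head}~1" >/dev/null 2>&1; then
      base="${head}~1"
    else
      base="$head"
    fi
  fi
  printf '%s\n' "$base"
}
```

A call such as `BASE_SHA=$(resolve_base_sha "$BEFORE" "$HEAD_SHA")` would then replace the nested `if` in the step, with `BEFORE` being whatever variable holds `github.event.before`.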

gemini-code-assist · 2026-06-01T12:53:00Z

+      env:
+        ANTHROPIC_API_KEY: ${{ inputs.anthropic-api-key }}
+        EVAL_BUDGET_USD: ${{ inputs.budget-usd }}
+        EVAL_CI: "1"


The action expects the evaluation runs to be written to the local ./runs/ directory for step summary generation and artifact uploading. However, since EVAL_STATE_DIR is not set in the environment, eval-harness defaults to writing runs to $HOME/.config/opencode/eval-harness/runs/. This causes the action to find no runs in the workspace, leading to empty step summaries and missing artifacts. Setting EVAL_STATE_DIR to . resolves this.

      env:
        ANTHROPIC_API_KEY: ${{ inputs.anthropic-api-key }}
        EVAL_BUDGET_USD: ${{ inputs.budget-usd }}
        EVAL_CI: "1"
        EVAL_STATE_DIR: "."
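The effect of the variable can be illustrated with a small stand-in. This sketch does not reproduce the real eval-harness code; `runs_dir` is a hypothetical function that only models the defaulting behaviour described in the comment above, using the default path quoted there.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for how the CLI might pick its runs directory:
# use EVAL_STATE_DIR when set, otherwise fall back to the per-user default.
runs_dir() {
  local state_dir="${EVAL_STATE_DIR:-$HOME/.config/opencode/eval-harness}"
  printf '%s/runs\n' "$state_dir"
}
```

Under this model, `EVAL_STATE_DIR=.` yields `./runs`, the directory the summary and upload steps read from, while an unset variable points outside the workspace.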

…dies, handoff doc

- .campaign/posts/01..07: HN Show post, 3 Reddit posts (LocalLLaMA, ClaudeAI, mlops), 3 blog posts (4-class attribution, 6-field FAIL, flaky-tests), 8-tweet X thread. Each includes a response playbook for likely comments.
- .campaign/awesome-pr-bodies/: per-list step-by-step submission guides + PR body templates. Pre-flight check found 4 dead/wrong-fit lists (Hannibal046/Awesome-LLM unmerged since 2025-07; visenger/awesome-mlops unmerged since 2024; e2b-dev/awesome-sdks-for-ai-agents dead since 2023; e2b-dev/awesome-ai-agents redirects tools elsewhere). 3 PRs opened today to active lists.
- scripts/eval/tools/stars-kpi.sh: read-only weekly KPI snapshot. Captures stars, forks, watchers, contributors, unique authors (30d), traffic (views/clones 14d), top referrers, top paths. Appends to ~/.eval-harness/kpi-history.ndjson. Prints awesome-list star-floor milestone tracker (target deferred PR thresholds).
- .campaign/README.md: layout + sequencing.
- .campaign/CAMPAIGN.md: 12-month handoff. Critical path day-by-day for first 14 days, weekly cadence months 1-6, success milestones at 100/500/1000/2000 stars, anti-patterns to avoid, what the agent can do in follow-up sessions and what only humans can do.

Awesome-list PRs opened today:

- taishi-i/awesome-ChatGPT-repositories #150
- tensorchord/Awesome-LLMOps #538
- steven2358/awesome-generative-ai #830

No source-code changes.

gemini-code-assist Bot reviewed Jun 1, 2026


hoainho added 4 commits June 1, 2026 13:17

docs(campaign): add awesome-opencode submission guide (YAML workflow,…

473acad

… PR #405)

docs(campaign): add kyrolabs/awesome-agents PR #531 submission guide …

3b7ceef

…+ verified merge-rate table Signed-off-by: Hoài Nhớ <nhoxtvt@gmail.com>

docs(campaign): add real-contributor PR #4151 submission guide (inspe…

e991220

…ct_ai 297-line example) Signed-off-by: Hoài Nhớ <nhoxtvt@gmail.com>

hoainho wants to merge 5 commits into main from campaign/2k-stars
