campaign(2k-stars): community health + broader positioning + docs/ + GitHub Action #30

Open
hoainho wants to merge 5 commits into main from campaign/2k-stars

hoainho (Contributor) commented Jun 1, 2026

What does this PR do?

End-to-end execution of Phase 1 (foundation) + Phase 2 (awesome-list PR wave) + Phase 3 partial (launch content + GitHub Action + KPI script) of the 2000-star contributor-attraction campaign.

Pure additions + a README pitch rewrite — no source-code changes, no test impact.

Why?

The repo has world-class internals but very few discovery surfaces. This PR adds the surfaces (badges, docs, action, issue templates, KPI tracking) and broadens the README pitch from "opencode-skill testing" → "behavior-regression testing for LLM agents (opencode runner today, more coming)".

What's in this PR (2 commits, 35 files, ~3000 lines)

Commit 1 — community health + docs + GitHub Action

| Surface | Before | After |
| --- | --- | --- |
| GitHub topics | none | 14 (llm-evaluation, ai-agents, regression-testing, opencode, llmops, claude, anthropic, ...) |
| GitHub description | "Behavior-regression eval harness for opencode skills..." | "Behavior-regression testing for LLM agents. 4-class attribution, 6-field FAIL schema..." |
| Discussions | disabled | enabled, 3 seed threads (#27, #28, #29) |
| README badges | none | npm / license / tests / stars / discussions / issues / good-first-issues |
| Hero demo | none | docs/assets/demo.tape (vhs script) + README GIF placeholder |
| Comparison docs | none | concepts.md, comparison.md, runners.md, why-not-promptfoo.md, docs/README.md |
| Community health | CONTRIBUTING.md only | + CODE_OF_CONDUCT, SECURITY, FUNDING, 3 issue templates, PR template |
| CI integration | none | shipped .github/actions/eval-harness composite action + example workflow |

Commit 2 — launch content + KPI + awesome-PR plans + handoff

| Surface | Content |
| --- | --- |
| .campaign/posts/ | HN Show post, Reddit (LocalLLaMA + ClaudeAI + mlops), 3 blog posts, X thread, all with response playbooks |
| .campaign/awesome-pr-bodies/ | Step-by-step submission guides for 3 active lists + revisit-later for the rest. Pre-flight check found 4 dead lists I'm explicitly skipping. |
| scripts/eval/tools/stars-kpi.sh | Weekly KPI snapshot: stars, forks, watchers, contributors, unique authors (30d), traffic (views/clones 14d), top referrers, awesome-list star-floor tracker. Tested live: works. |
| .campaign/CAMPAIGN.md | 12-month handoff doc: 14-day critical path, weekly cadence, milestones at 100/500/1000/2000 stars, anti-patterns, what the agent can do in follow-up sessions vs what only humans can do. |

Awesome-list PRs opened today (outside this PR, under nano-step org)

How is it tested?

docs-only + config-only + action-only + tooling-only.

The existing 20 test suites under scripts/eval/tests/ are untouched and still green (verify locally with for t in scripts/eval/tests/*.sh; do bash "$t"; done).
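The one-line loop above keeps going after a failing suite and only surfaces the exit status of the last suite it ran. A minimal sketch of a stricter local check follows, assuming each suite signals failure through a non-zero exit code. The `run_suites` helper is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run every *.sh suite in a directory, report each
# result, and return non-zero if at least one suite failed.
run_suites() {
  local dir="${1:-scripts/eval/tests}"
  local failed=0
  local t
  for t in "$dir"/*.sh; do
    [[ -e "$t" ]] || continue   # the glob matched nothing
    if bash "$t"; then
      echo "PASS $t"
    else
      echo "FAIL $t"
      failed=$((failed + 1))
    fi
  done
  echo "$failed suite(s) failed"
  [[ "$failed" -eq 0 ]]
}

# Usage from the repo root:
#   run_suites scripts/eval/tests
```

Counting failures instead of stopping at the first one keeps the full picture visible in a single run, which is usually what a reviewer wants before merging.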

The GitHub Action itself is a thin composite over the existing CLI; end-to-end smoke happens when consumed by a downstream repo.

The KPI script was test-run against the live repo and produced expected output (4 stars, 17 views/14d baseline).

Before / after evidence

See the table above. The repo went from "0 surfaces for discovery" → "every surface a stranger needs to evaluate, contribute to, or integrate with eval-harness."

Issues created alongside this PR

Recommended next step

Merge this PR. Then follow the 14-day critical path in .campaign/CAMPAIGN.md. The campaign is set up; the next 12 months are execution.

Checklist

  • No test suites touched — N/A
  • CHANGELOG.md will be updated as part of v0.4.3 (this is meta-campaign work, not a release)
  • Read CONTRIBUTING.md
  • No score.sh or attribute.sh changes — GNU/BSD grep check N/A

…Action

- README: badges row, broader-positioning hero ('regression testing for LLM
  agents'), hero GIF placeholder + docs/* learn-more links.
- .github/: CODE_OF_CONDUCT.md, SECURITY.md, FUNDING.yml, issue templates
  (bug / feature / case-recipe), PR template, ISSUE_TEMPLATE/config.yml.
- docs/: concepts.md (4 core ideas), comparison.md (vs promptfoo /
  DeepEval / Ragas / OpenAI Evals), runners.md (runner abstraction +
  langgraph/claude-agent-sdk roadmap), why-not-promptfoo.md (direct
  head-to-head), docs/README.md index, docs/assets/demo.tape (vhs script).
- .github/actions/eval-harness/: composite GitHub Action (action.yml +
  README.md) — installs jq/yq/opencode/eval-harness, runs against
  changed skills, posts job summary with 6-field FAIL, uploads runs/
  artifact, exit 12 on regression. Action README with quickstart +
  inputs/outputs + examples + pinning + marketplace publishing guide.
- .github/workflows/eval-example.yml: example PR/push integration.

No source-code changes. No test impact.

Part of the campaign/2k-stars roadmap. See [internal doc].

gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces comprehensive repository configuration, templates, and documentation, including issue templates, a pull request template, a Code of Conduct, a Security Policy, and detailed conceptual and comparison guides. It also adds a composite GitHub Action (.github/actions/eval-harness/action.yml) to run behavior-regression testing in CI. The review feedback highlights critical improvements for the GitHub Action: preventing pipeline failures when no skills are changed by replacing grep with awk filtering, handling empty BASE_SHA values on new branch pushes to avoid git diff crashes, and explicitly setting EVAL_STATE_DIR to ensure evaluation runs are written to the local workspace for step summaries and artifact uploads.

Comment on lines +120 to +124
        CHANGED_SKILLS=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA" \
          | grep -E '^.opencode/skills/[^/]+/' \
          | awk -F/ '{print $3}' \
          | sort -u \
          | paste -sd "," -)

critical

Because set -euo pipefail is enabled, if grep finds no matching lines (which happens whenever a commit/PR does not touch any files in .opencode/skills/), it will exit with status 1, causing the entire pipeline and the workflow step to fail. We can avoid this and simplify the pipeline by performing the path filtering directly inside awk.

        CHANGED_SKILLS=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA" \
          | awk -F/ '$1 == ".opencode" && $2 == "skills" && $3 != "" {print $3}' \
          | sort -u \
          | paste -sd "," -)
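The awk-based filter can be tried in isolation with a small sketch. The function name `changed_skills` is hypothetical, and it reads paths on stdin instead of calling git diff so it runs without a repository. It also adds an `NF >= 4` condition, which is not part of the suggestion above, to keep the original regex's requirement that the path points at something inside a skill directory rather than at a file directly under .opencode/skills/.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read file paths on stdin and print a comma-separated,
# de-duplicated list of skill directory names under .opencode/skills/.
# The subshell body scopes `set -euo pipefail` to this function only.
changed_skills() (
  set -euo pipefail
  # awk exits 0 even when no line matches, unlike grep, so an empty diff
  # does not abort the pipeline under pipefail.
  awk -F/ '$1 == ".opencode" && $2 == "skills" && $3 != "" && NF >= 4 {print $3}' \
    | sort -u \
    | paste -sd "," -
)

# Usage:
#   git diff --name-only "$BASE_SHA" "$HEAD_SHA" | changed_skills
```

With no matching paths the function yields no skill names and still exits with status 0, which is the behavior the workflow step needs.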

Comment on lines +112 to +118
        if [[ "${{ github.event_name }}" == "pull_request" ]]; then
          BASE_SHA="${{ github.event.pull_request.base.sha }}"
          HEAD_SHA="${{ github.event.pull_request.head.sha }}"
        else
          BASE_SHA="${{ github.event.before }}"
          HEAD_SHA="${{ github.sha }}"
        fi

high

In a push event, if a new branch is pushed, github.event.before is set to 0000000000000000000000000000000000000000. Passing this to git diff will cause a fatal error (fatal: bad object ...) and crash the workflow. We should handle this by falling back to HEAD~1 if it exists, or to HEAD_SHA itself (resulting in an empty diff).

        if [[ "${{ github.event_name }}" == "pull_request" ]]; then
          BASE_SHA="${{ github.event.pull_request.base.sha }}"
          HEAD_SHA="${{ github.event.pull_request.head.sha }}"
        else
          BASE_SHA="${{ github.event.before }}"
          HEAD_SHA="${{ github.sha }}"
          if [[ "$BASE_SHA" == "0000000000000000000000000000000000000000" || -z "$BASE_SHA" ]]; then
            if git rev-parse --verify HEAD~1 >/dev/null 2>&1; then
              BASE_SHA="HEAD~1"
            else
              BASE_SHA="$HEAD_SHA"
            fi
          fi
        fi
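The same fallback can be factored into a function so each branch is easy to exercise by hand. This is a sketch: `resolve_base_sha` is a hypothetical name, and the all-zero SHA is built with printf rather than typed out. With a shallow checkout (actions/checkout defaults to fetch-depth: 1), HEAD~1 is usually absent, so the final branch is the one that runs unless more history is fetched.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the base revision to diff against.
#   $1 = candidate base SHA (may be empty or all zeros on a new-branch push)
#   $2 = head SHA
resolve_base_sha() {
  local before="${1:-}"
  local head="$2"
  local zero_sha
  zero_sha="$(printf '%040d' 0)"   # 40 zeros, as sent for a new branch

  if [[ -n "$before" && "$before" != "$zero_sha" ]]; then
    printf '%s\n' "$before"
  elif git rev-parse --verify HEAD~1 >/dev/null 2>&1; then
    printf '%s\n' "HEAD~1"
  else
    # Diffing head against itself yields an empty diff instead of an error.
    printf '%s\n' "$head"
  fi
}

# Usage inside the step:
#   BASE_SHA="$(resolve_base_sha "$BASE_SHA" "$HEAD_SHA")"
```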

Comment on lines +131 to +134
      env:
        ANTHROPIC_API_KEY: ${{ inputs.anthropic-api-key }}
        EVAL_BUDGET_USD: ${{ inputs.budget-usd }}
        EVAL_CI: "1"

high

The action expects the evaluation runs to be written to the local ./runs/ directory for step summary generation and artifact uploading. However, since EVAL_STATE_DIR is not set in the environment, eval-harness defaults to writing runs to $HOME/.config/opencode/eval-harness/runs/. This causes the action to find no runs in the workspace, leading to empty step summaries and missing artifacts. Setting EVAL_STATE_DIR to . resolves this.

      env:
        ANTHROPIC_API_KEY: ${{ inputs.anthropic-api-key }}
        EVAL_BUDGET_USD: ${{ inputs.budget-usd }}
        EVAL_CI: "1"
        EVAL_STATE_DIR: "."
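A cheap way to catch this class of misconfiguration early is to check that the run directory is populated before building the summary. The sketch below assumes runs land in $EVAL_STATE_DIR/runs (so ./runs with the setting above), as the comment implies. `runs_present` is a hypothetical helper, not something the action ships.

```shell
#!/usr/bin/env bash

# Hypothetical guard: succeed only if the runs directory exists and
# contains at least one entry.
#   $1 = directory to check (defaults to $EVAL_STATE_DIR/runs, else ./runs)
runs_present() {
  local dir="${1:-${EVAL_STATE_DIR:-.}/runs}"
  [[ -d "$dir" ]] && [[ -n "$(ls -A "$dir" 2>/dev/null)" ]]
}

# Usage inside a step, before the summary and artifact upload:
#   runs_present || echo "::warning::no runs found in workspace; is EVAL_STATE_DIR set?"
```

Emitting a warning rather than failing keeps the step's exit code reserved for the harness result itself.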

hoainho added 4 commits June 1, 2026 13:17
…dies, handoff doc

- .campaign/posts/01..07: HN Show post, 3 Reddit posts (LocalLLaMA, ClaudeAI,
  mlops), 3 blog posts (4-class attribution, 6-field FAIL, flaky-tests),
  8-tweet X thread. Each includes a response playbook for likely comments.
- .campaign/awesome-pr-bodies/: per-list step-by-step submission guides
  + PR body templates. Pre-flight check found 4 dead/wrong-fit lists
  (Hannibal046/Awesome-LLM unmerged since 2025-07; visenger/awesome-mlops
  unmerged since 2024; e2b-dev/awesome-sdks-for-ai-agents dead since 2023;
  e2b-dev/awesome-ai-agents redirects tools elsewhere). 3 PRs opened today
  to active lists.
- scripts/eval/tools/stars-kpi.sh: read-only weekly KPI snapshot. Captures
  stars, forks, watchers, contributors, unique authors (30d), traffic
  (views/clones 14d), top referrers, top paths. Appends to
  ~/.eval-harness/kpi-history.ndjson. Prints awesome-list star-floor
  milestone tracker (target deferred PR thresholds).
- .campaign/README.md: layout + sequencing.
- .campaign/CAMPAIGN.md: 12-month handoff. Critical path day-by-day for
  first 14 days, weekly cadence months 1-6, success milestones at
  100/500/1000/2000 stars, anti-patterns to avoid, what the agent can do
  in follow-up sessions and what only humans can do.

Awesome-list PRs opened today:
- taishi-i/awesome-ChatGPT-repositories #150
- tensorchord/Awesome-LLMOps #538
- steven2358/awesome-generative-ai #830

No source-code changes.
…+ verified merge-rate table

Signed-off-by: Hoài Nhớ <nhoxtvt@gmail.com>
…ct_ai 297-line example)

Signed-off-by: Hoài Nhớ <nhoxtvt@gmail.com>