Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion dist/index.js

Large diffs are not rendered by default.

18 changes: 16 additions & 2 deletions docs/agentmeter-action-spec.md
Original file line number Diff line number Diff line change
Expand Up @@ -80,13 +80,26 @@ Every API call and comment post uses `core.warning()` for errors, never `core.se

Partial explicit overrides (e.g. providing only `input_tokens`) merge with the artifact or extracted values for the remaining fields rather than zeroing them out.

### Turns extraction priority

`turns` follows the same precedence pattern:

1. Explicit `turns` input
2. `extractTurnsFromOutput(agent_output)` — tried in order:
- Claude Code JSON: top-level `num_turns` field
- Codex exec JSONL: count of `turn.completed` events
- Regex fallback: `turns: N`, `N turns`, or `turn N of <total>` (captures the total)
3. `null` — shows as `—` in the dashboard

The resolved value is used in both the ingest payload and the PR/issue comment.

### `workflow_run_id` auto-resolution (`src/workflow-run.ts`)

When set, `resolveWorkflowRun` does four things:

1. **Gate** — calls `listJobsForWorkflowRun` and exits early unless a job named `conclusion` has completed. Prevents ~5 duplicate ingests from gh-aw's multi-job structure. Single-job workflows pass through unconditionally.
2. **Status normalization** — maps GitHub conclusions (`failure` → `failed`, `skipped` → skip entirely) to the AgentMeter API enum. Unrecognized statuses (e.g. custom `needs_human`) pass through unchanged.
3. **Trigger resolution** — reads `pull_requests[]` from the run object; falls back to a `pulls.list` lookup by head branch if empty (GitHub API quirk for some PR-triggered runs). Works for fork PRs (`head` parameter is branch name only, not `owner:branch`).
3. **Trigger resolution** — reads `pull_requests[]` from the run object; falls back to a `pulls.list` lookup by head branch if empty (GitHub API quirk for some PR-triggered runs). Works for fork PRs. Issue branches are matched only when the branch name contains `agent/issue-N` (the gh-aw convention) — bare `issue-N` patterns are intentionally not matched to avoid misattributing unrelated branches like `feature/fix-issue-12-auth`.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete documentation: This paragraph doesn't mention the SHA validation that was added to the pulls.list fallback.

The code now validates PRs by head SHA to avoid misattribution when multiple PRs share the same branch name (e.g., reused or fork branches). Consider adding:

"...falls back to a pulls.list lookup by head branch if empty (GitHub API quirk for some PR-triggered runs). When multiple PRs exist for the same branch, the PR is validated by matching the head SHA to prevent misattribution. Works for fork PRs. Issue branches..."

4. **Token artifact** — downloads and unzips the `agent-tokens` artifact using `fflate`.

### Pricing (`src/pricing.ts`)
Expand Down Expand Up @@ -128,7 +141,8 @@ See `action.yml` for the authoritative list. Key inputs:
| `engine` | `claude` / `codex` |
| `status` | Run outcome passed to API |
| `input_tokens` / `output_tokens` / `cache_read_tokens` / `cache_write_tokens` | Explicit token counts (override extraction per-field) |
| `agent_output` | Raw stdout for auto-extraction |
| `turns` | Explicit turn count (auto-extracted from `agent_output` if omitted) |
| `agent_output` | Raw stdout for auto-extraction of tokens and turns |
| `started_at` / `completed_at` | ISO 8601 timestamps (override self-measured) |
| `post_comment` | Whether to upsert a PR/issue comment |
| `api_url` | API base URL (for local dev / self-hosted) |
Expand Down
7 changes: 4 additions & 3 deletions docs/challenges.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ The action reads token counts from an `agent-tokens` artifact uploaded by the ag

### 3. Trigger number is null for non-standard branch names

For `workflow_run` events, the action resolves the PR/issue number from `pull_requests[]` on the run object, then falls back to a `pulls.list` API call by head branch. If the branch name doesn't follow gh-aw's `agent/issue-N` pattern and `pull_requests[]` is empty (common for push-triggered runs), `triggerNumber` is `null` and no comment is posted.
For `workflow_run` events, the action resolves the PR/issue number from `pull_requests[]` on the run object, then falls back to a `pulls.list` API call by head branch. Issue numbers are only inferred when the branch name matches the gh-aw convention `agent/issue-N` exactly — this is intentional to avoid misattributing unrelated branches (e.g. `feature/fix-issue-12-auth`). If the branch doesn't follow this pattern and `pull_requests[]` is empty, `triggerNumber` is `null` and no comment is posted.

**Workaround:** Pass `trigger_number` explicitly as an input to override resolution.

Expand Down Expand Up @@ -140,6 +140,7 @@ The last `token_count` event in the file contains cumulative totals for the full
- `GITHUB_TOKEN` is always available via the `github_token` input default — no config needed.
- Comment upsert (update-in-place) works correctly, including across both old 5-column and current 6-column comment formats.
- All four token types (input, output, cache read, cache write) are tracked when available.
- `turns` is auto-extracted from `agent_output` when not provided explicitly — Claude Code JSON (`num_turns`), Codex exec JSONL (`turn.completed` count), or regex fallback. The resolved value appears in both the ingest payload and the PR/issue comment.
- Partial token overrides: providing only `input_tokens` still falls back to artifact or extracted values for the other fields.

---
Expand All @@ -156,12 +157,12 @@ The last `token_count` event in the file contains cumulative totals for the full
| Token data for non-gh-aw `workflow_run` users | ✅ Documented | See README and challenge #4 above |
| Zip parsing | ✅ | Uses `fflate` — proper decompression |
| `githubRunId` in payload | ✅ | Uses agent run ID when `workflow_run_id` is set |
| Trigger number resolution | ✅ | `pull_requests[]` array + `pulls.list` API fallback; fork PRs supported |
| Trigger number resolution | ✅ | `pull_requests[]` array + `pulls.list` API fallback; issue branch requires `agent/issue-N` prefix (gh-aw convention) to prevent misattribution |
| Trigger type resolution | ✅ | `issue_comment` correctly classified as `pr_comment` vs `issue_comment` |
| Status normalization | ✅ | Raw GitHub conclusion mapped internally; custom statuses pass through unchanged |
| Partial token overrides | ✅ | Per-field merge — partial overrides don't zero out unspecified fields |
| Multiple firing dedup | ✅ for gh-aw | Gate on `conclusion` job name |
| Timestamps / duration | ✅ | Sourced from workflow run API; guards against NaN from invalid timestamps |
| Timestamps / duration | ✅ | Sourced from workflow run API; `null` when unavailable (falls back to action start/now); duration clamped to ≥0 |
| Workflow name | ✅ | Uses agent workflow name, not companion |
| Comment format migration | ✅ | Old 5-column comments parsed correctly |
| Comment ordering | ✅ | Newest runs displayed first; 5 visible, rest in collapsible section |
Expand Down
22 changes: 20 additions & 2 deletions docs/testing.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ npm run test:watch

The test suite covers:

- **`token-extractor.test.ts`** — JSON and regex-based token extraction, priority logic, edge cases
- **`token-extractor.test.ts`** — JSON and regex-based token extraction, turns extraction (Claude Code JSON `num_turns`, Codex exec JSONL `turn.completed` count, regex fallback, null cases), priority logic, edge cases
- **`context.test.ts`** — GitHub event → trigger type mapping, trigger ref extraction for all event types, `issue_comment` PR vs issue distinction
- **`comment.test.ts`** — Markdown comment formatting, status emojis, cost formatting, multi-run accumulation, newest-first ordering, 5-run visible limit with collapsible history, old/new column format compatibility
- **`ingest.test.ts`** — API client success/failure handling, retry logic, Authorization header
Expand Down Expand Up @@ -96,6 +96,24 @@ export INPUT_AGENT_OUTPUT='{"usage":{"input_tokens":1000,"output_tokens":500,"ca
node dist/index.js
```

### 6. Test turns auto-extraction

```bash
# Claude Code JSON with num_turns
export INPUT_AGENT_OUTPUT='{"num_turns":7,"usage":{"input_tokens":1000,"output_tokens":500}}'
node dist/index.js

# Codex exec JSONL turn.completed events
export INPUT_AGENT_OUTPUT='{"type":"turn.completed"}
{"type":"turn.completed"}
{"type":"turn.completed"}'
node dist/index.js

# Explicit turns override (auto-extraction ignored)
export INPUT_TURNS="5"
node dist/index.js
```

---

## Testing against a local AgentMeter API
Expand Down Expand Up @@ -132,7 +150,7 @@ It uses hardcoded synthetic token counts and a `sleep 10` step so the reported d

1. Ensure all tests pass: `npm test`
2. Run type check: `npm run type-check`
3. Commit everything — the pre-commit hook automatically runs `npm run build` and stages `dist/index.js` and `dist/licenses.txt`
3. Commit everything — the pre-commit hook runs lint-staged (Biome auto-fix), then `npm run lint` as a hard gate, then `npm run type-check`, then `npm run build`, and finally stages `dist/index.js` and `dist/licenses.txt`
4. Push to `main`
5. Create a GitHub release with a semver tag: `v1.0.0`
6. Update the major version tag: `git tag -f v1 && git push -f origin v1`
Expand Down
4 changes: 3 additions & 1 deletion src/run.ts
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,9 @@ export async function run(): Promise<void> {
if (!inputs.triggerEvent) resolvedTriggerEvent = runData.triggerEvent;
resolvedTriggerRef = runData.triggerRef;
resolvedTriggerType = runData.triggerType;
if (runData.workflowName) resolvedWorkflowName = runData.workflowName;
// Always use the agent workflow's name from the run data — ctx.workflowName is the
// companion workflow and would misattribute if used as a fallback here.
resolvedWorkflowName = runData.workflowName;
workflowRunTokens = runData.tokens;
}
}
Expand Down
17 changes: 13 additions & 4 deletions src/workflow-run.ts
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,7 @@ export async function resolveWorkflowRun({
const { triggerNumber, triggerEvent, triggerType, triggerRef } = await resolveTrigger({
pullRequests: run?.pull_requests ?? [],
headBranch: run?.head_branch ?? '',
headSha: run?.head_sha ?? '',
event: run?.event ?? '',
octokit,
owner,
Expand Down Expand Up @@ -223,6 +224,7 @@ async function fetchRun({
run_started_at?: string | null;
updated_at?: string | null;
head_branch?: string | null;
head_sha?: string | null;
event?: string | null;
name?: string | null;
pull_requests?: Array<{ number: number }>;
Expand All @@ -237,6 +239,7 @@ async function fetchRun({
run_started_at: data.run_started_at,
updated_at: data.updated_at,
head_branch: data.head_branch,
head_sha: data.head_sha,
event: data.event,
name: data.name,
pull_requests: (data.pull_requests ?? []).map((pr) => ({ number: pr.number })),
Expand All @@ -255,6 +258,7 @@ async function fetchRun({
*/
async function resolveTrigger({
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style: Function parameters should be in alphabetical order per the repo's style guide.

Current: headBranch, headSha, event, octokit, owner, pullRequests, repo
Expected: event, headBranch, headSha, octokit, owner, pullRequests, repo

(Note: This was pre-existing, but since the PR touches these lines, it's a good opportunity to fix it.)

headBranch,
headSha,
event,
octokit,
owner,
Expand All @@ -263,6 +267,8 @@ async function resolveTrigger({
}: {
/** Head branch name of the triggering run */
headBranch: string;
/** Head commit SHA of the triggering run — used to validate the PR fallback match */
headSha: string;
/** Event that triggered the original workflow run */
event: string;
/** Authenticated Octokit instance */
Expand Down Expand Up @@ -301,17 +307,20 @@ async function resolveTrigger({
// Omit owner prefix so forked PRs are also matched (fork owner differs from base owner)
head: headBranch,
owner,
per_page: 1,
per_page: 5,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage: The change from per_page: 1 to per_page: 5 and the new SHA validation logic (line 317) aren't covered by tests.

The existing test at __tests__/workflow-run.test.ts:156 mocks pulls.list but doesn't include head.sha in the mock data, so it doesn't exercise the SHA matching logic.

Add a test that verifies:

  • When multiple PRs exist for the same branch
  • The PR with matching head.sha is selected
  • Fallback to first PR when SHA doesn't match

repo,
sort: 'updated',
state: 'all',
});
if (prs[0]) {
// Validate by head SHA when available to avoid matching the wrong PR when multiple
// PRs share the same branch name (e.g. reused or fork branches).
const match = headSha ? (prs.find((pr) => pr.head.sha === headSha) ?? prs[0]) : prs[0];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type safety violation: The pr parameter has an implicit any type, which fails strict TypeScript mode (TS7006).

Add an explicit type annotation:

const match = headSha ? (prs.find((pr: { head: { sha: string } }) => pr.head.sha === headSha) ?? prs[0]) : prs[0];

Or better yet, destructure the data with proper Octokit types from RestEndpointMethodTypes.

if (match) {
return {
triggerNumber: prs[0].number,
triggerNumber: match.number,
triggerEvent: event,
triggerType: normalizeTriggerType({ event, isPR: true }),
triggerRef: `PR #${prs[0].number}`,
triggerRef: `PR #${match.number}`,
};
}
} catch (error) {
Expand Down
Loading