feat(import): enrich Claude Code import with git/PR metadata and new metrics#6

Open
jmelloy wants to merge 2 commits into tobilg:main from jmelloy:import-enhancements

jmelloy commented May 28, 2026

Summary

  • Git & PR metadata: A first-pass scan of each JSONL file now collects gitBranch, cwd, and pr-link entries, then attaches the git_branch, repository, pr_number, and pr_url attributes to all emitted metrics and transcript logs.
  • New metrics from JSONL: The importer now emits the full set of metrics Claude Code would have sent via OTLP: claude_code.session.count (once per file), claude_code.pull_request.count (from pr-link entries), claude_code.commit.count (from Bash tool calls that contain git commit), claude_code.active_time.total (from system/turn_duration entries), and claude_code.lines_of_code.count broken down by file path and type (from Edit, Write, and MultiEdit tool calls).
  • Test coverage: New table-driven tests cover MultiEdit line counting, session.count emission, pr-link deduplication, and the system/turn_duration active-time path.
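
A minimal sketch of what the first-pass metadata scan could look like. It is not the PR's actual code. The JSON keys gitBranch, cwd, prNumber, prRepository, and prUrl are taken from the description above. The `type` discriminator, the `sessionMeta` and `prLink` types, the `scanSessionMeta` function, deduplication by URL, and the use of the last path element of cwd as the repository fallback are all assumptions made for illustration.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// entry mirrors only the JSONL fields named in the PR description.
// Assumption: entries carry a "type" field and prNumber is a JSON number.
type entry struct {
	Type         string `json:"type"`
	GitBranch    string `json:"gitBranch"`
	Cwd          string `json:"cwd"`
	PRNumber     int    `json:"prNumber"`
	PRRepository string `json:"prRepository"`
	PRURL        string `json:"prUrl"`
}

// prLink is one deduplicated pull request reference (hypothetical type).
type prLink struct {
	Number     int
	Repository string
	URL        string
}

// sessionMeta is a hypothetical result type for the first pass.
type sessionMeta struct {
	GitBranch  string
	Repository string
	PRs        []prLink
}

// scanSessionMeta reads JSONL text line by line and collects the first
// branch and cwd it sees, plus each distinct pr-link entry.
func scanSessionMeta(jsonl string) sessionMeta {
	var meta sessionMeta
	var cwd string
	seen := map[string]bool{} // PR URLs already recorded

	sc := bufio.NewScanner(strings.NewReader(jsonl))
	// Transcript lines can be long, so raise the scanner's line limit.
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
	for sc.Scan() {
		var e entry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			continue // skip malformed or empty lines instead of aborting
		}
		if meta.GitBranch == "" && e.GitBranch != "" {
			meta.GitBranch = e.GitBranch
		}
		if cwd == "" && e.Cwd != "" {
			cwd = e.Cwd
		}
		if e.Type == "pr-link" && e.PRURL != "" && !seen[e.PRURL] {
			seen[e.PRURL] = true
			meta.PRs = append(meta.PRs, prLink{e.PRNumber, e.PRRepository, e.PRURL})
		}
	}

	// Prefer the repository named by a pr-link entry; otherwise fall back
	// to the last path element of cwd (an assumed convention).
	if len(meta.PRs) > 0 && meta.PRs[0].Repository != "" {
		meta.Repository = meta.PRs[0].Repository
	} else if cwd != "" {
		meta.Repository = filepath.Base(cwd)
	}
	return meta
}

func main() {
	// Made-up input: one ordinary entry and the same pr-link twice.
	input := `{"type":"user","gitBranch":"feature-x","cwd":"/work/myrepo"}
{"type":"pr-link","prNumber":6,"prRepository":"owner/repo","prUrl":"https://example.com/owner/repo/pull/6"}
{"type":"pr-link","prNumber":6,"prRepository":"owner/repo","prUrl":"https://example.com/owner/repo/pull/6"}`
	fmt.Printf("%+v\n", scanSessionMeta(input))
}
```

Collecting the metadata in a separate pass means every metric and log emitted in the second pass can carry the same attributes, even when the pr-link entry appears late in the file.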

Motivation

Previously the Claude Code JSONL importer only produced token/cost metrics and transcript logs. This left the imported data significantly less rich than live OTLP telemetry, making historical imports look sparse in the dashboard. This change closes that gap so imported sessions are indistinguishable from live sessions in the metrics views.

Test plan

  • cd backend && go test -v ./internal/importer/... passes
  • Import a real Claude Code session directory and verify the dashboard shows session count, active time, lines-of-code, and commit/PR metrics
  • Re-importing the same directory does not produce duplicate PR metrics (dedup logic in collectSessionMeta)

🤖 Generated with Claude Code

jmelloy and others added 2 commits May 28, 2026 09:19
…metrics

- Capture gitBranch and attach as git_branch attribute to all metrics and logs
- Parse pr-link entries (prNumber, prRepository, prUrl) and emit
  claude_code.pull_request.count metrics; attach PR info to log attributes
- Derive repository dimension from prRepository or cwd for all metrics
- Emit claude_code.lines_of_code.count from Edit/Write/MultiEdit tool calls,
  with file_type and file_path attributes for breakdown by filetype
- Emit claude_code.session.count (one per JSONL file)
- Emit claude_code.commit.count from Bash tool calls containing git commit
- Emit claude_code.active_time.total from system/turn_duration entries
  (one data point per turn, consistent with OTLP telemetry)
- Add tests covering all new metrics and helper functions
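
One way the lines-of-code counting for Edit, Write, and MultiEdit tool calls might be sketched is shown below. This is an illustration under stated assumptions, not the importer's implementation: the tool input keys (file_path, content, old_string, new_string, edits), the `countToolLines` and `countLines` helpers, the `lineDelta` type, and the rule that every line of the old text counts as removed and every line of the new text counts as added are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// edit is a single replacement. Field names are assumed tool input keys.
type edit struct {
	OldString string `json:"old_string"`
	NewString string `json:"new_string"`
}

// toolInput covers the assumed inputs of Edit, Write, and MultiEdit.
type toolInput struct {
	FilePath  string `json:"file_path"`
	Content   string `json:"content"`    // Write
	OldString string `json:"old_string"` // Edit
	NewString string `json:"new_string"` // Edit
	Edits     []edit `json:"edits"`      // MultiEdit
}

// lineDelta is a hypothetical per-call result carrying the attributes
// the commit message mentions (file_path, file_type).
type lineDelta struct {
	FilePath string
	FileType string
	Added    int
	Removed  int
}

// countLines counts text lines; a trailing newline does not add a line.
func countLines(s string) int {
	if s == "" {
		return 0
	}
	n := strings.Count(s, "\n")
	if !strings.HasSuffix(s, "\n") {
		n++
	}
	return n
}

// countToolLines returns the line delta for one tool call. The boolean
// is false for tools that do not modify files or for unparsable input.
func countToolLines(tool string, raw []byte) (lineDelta, bool) {
	var in toolInput
	if err := json.Unmarshal(raw, &in); err != nil {
		return lineDelta{}, false
	}
	d := lineDelta{
		FilePath: in.FilePath,
		FileType: strings.TrimPrefix(filepath.Ext(in.FilePath), "."),
	}
	switch tool {
	case "Write":
		d.Added = countLines(in.Content)
	case "Edit":
		d.Added = countLines(in.NewString)
		d.Removed = countLines(in.OldString)
	case "MultiEdit":
		// Aggregate across every edit in the batch.
		for _, e := range in.Edits {
			d.Added += countLines(e.NewString)
			d.Removed += countLines(e.OldString)
		}
	default:
		return lineDelta{}, false
	}
	return d, true
}

func main() {
	// Made-up MultiEdit input with two edits against one file.
	raw := []byte(`{"file_path":"main.go","edits":[
		{"old_string":"a\nb","new_string":"x"},
		{"old_string":"","new_string":"y\nz\n"}]}`)
	d, ok := countToolLines("MultiEdit", raw)
	fmt.Println(d, ok)
}
```

Counting whole old and new strings is a deliberate simplification. A real importer might diff the two strings instead, which would give smaller numbers when an edit leaves most lines unchanged.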

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… PR dedup

- TestClaudeParserMultiEdit: verifies MultiEdit tool calls aggregate lines
  added/removed across all edits in the batch
- TestClaudeParserSessionMetric: verifies exactly one session metric per file
  and that duplicate pr-link entries emit only one PR metric

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>