feat(groq): parse executed_tools → metadata.builtInToolResults (#69 S5) #73

Merged: stackbilt-admin merged 1 commit into main from feat/groq-builtin-tools-response on May 29, 2026

@stackbilt-admin
Member

Implements S5 of the Groq built-in-tools sprint (#69), on top of the merged S4 adapter (#72). This completes the end-to-end citation path the S4 docs described.

What

formatResponse now parses message.executed_tools[] → LLMResponse.metadata.builtInToolResults:

const citations = res.metadata?.builtInToolResults?.[0]?.results ?? [];
// → [{ title, url, content, score }, …]
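The snippet above reads only the first execution. When a response carries more than one search execution, a caller may want every citation. A minimal sketch, assuming the metadata shape described in this PR; the `Citation` and `ResponseLike` types and the `allCitations` helper are hypothetical names for illustration, not part of the library:

```typescript
// Hypothetical local types, inferred from the shape described in this PR.
interface Citation {
  title: string;
  url: string;
  content: string;
  score: number;
}

interface ResponseLike {
  metadata?: {
    builtInToolResults?: Array<{ type: string; results: Citation[] }>;
  };
}

// Hypothetical helper: gather citations from every execution, in order.
// Returns an empty array when the field is omitted (no search ran).
function allCitations(res: ResponseLike): Citation[] {
  const out: Citation[] = [];
  for (const exec of res.metadata?.builtInToolResults ?? []) {
    out.push(...exec.results);
  }
  return out;
}
```

Because the field is omitted entirely when no search ran, the `?? []` fallback keeps the caller from branching on its presence.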

Parsing follows the S0-locked wire shape (executed_tools[].search_results.results[], where each result is {title, url, content, score}):

  • keep only executions carrying a non-empty search_results.results;
  • flatten them into results, preserving per-execution type / name / arguments;
  • non-search runs (e.g. code_interpreter) carry no citations and drop out by design; the field is omitted entirely when no search ran.
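The three rules above can be sketched as a single pass over the array. This is an illustration under stated assumptions, not the merged implementation: `extractBuiltInToolResults` and the interface names are hypothetical, and `arguments` is typed as `unknown` because its wire type is not stated here.

```typescript
// Hypothetical types, inferred from the wire shape described above.
interface Citation {
  title: string;
  url: string;
  content: string;
  score: number;
}

interface ExecutedTool {
  type: string;
  name?: string;
  arguments?: unknown; // wire type not stated in this PR; assumption
  search_results?: { results?: Citation[] };
}

interface BuiltInToolResult {
  type: string;
  name?: string;
  arguments?: unknown;
  results: Citation[];
}

// Hypothetical helper; the real formatResponse may be structured differently.
function extractBuiltInToolResults(
  executedTools?: ExecutedTool[],
): BuiltInToolResult[] | undefined {
  if (!Array.isArray(executedTools)) return undefined;

  const out: BuiltInToolResult[] = [];
  for (const exec of executedTools) {
    const results = exec.search_results?.results;
    // Rule 1: keep only executions with a non-empty search_results.results.
    if (!Array.isArray(results) || results.length === 0) continue;

    // Rule 2: lift results up one level, preserving type / name / arguments.
    const entry: BuiltInToolResult = { type: exec.type, results };
    if (exec.name !== undefined) entry.name = exec.name;
    if (exec.arguments !== undefined) entry.arguments = exec.arguments;
    out.push(entry);
  }

  // Rule 3: omit the field entirely when no search ran.
  return out.length > 0 ? out : undefined;
}
```

Returning `undefined` rather than an empty array lets the caller leave the key off `metadata` altogether, which matches the "omitted entirely" behaviour described above.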

Also surfaces message.reasoning (the model's internal search queries, present on both families) on metadata.reasoning when present.

Schema: deliberately shallow

GROQ_RESPONSE_SCHEMA gains an optional executed_tools entry that validates only type. Citation sub-fields are intentionally not guarded:

SchemaDriftError routes through the fallback chain, and the fallback host (Cerebras gpt-oss) doesn't run built-in tools. A false drift on a sub-field that was sampled only once (n=1) in the S0 spike would silently degrade a working search into a tool-less response. The parser soft-degrades instead.

Citation-field coverage lives in a parser unit test — the binding review note's explicitly-accepted alternative to a deep fixture (the committed shape-map fixtures are depth-truncated at search_results, the caveat flagged on #70).
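To make the shallow/deep distinction concrete, here is a minimal sketch of a check that guards only `type` and deliberately ignores everything under `search_results`. It does not reproduce the real GROQ_RESPONSE_SCHEMA format or the library's SchemaDriftError; `SketchDriftError` and `checkExecutedToolsShallow` are hypothetical stand-ins, and treating a non-array `executed_tools` as drift is an assumption.

```typescript
// Hypothetical stand-in for the library's drift error.
class SketchDriftError extends Error {}

// Hypothetical shallow check: validates presence and type of `type` only.
function checkExecutedToolsShallow(message: Record<string, unknown>): void {
  const tools = message["executed_tools"];
  if (tools === undefined) return; // the entry is optional

  if (!Array.isArray(tools)) {
    throw new SketchDriftError("executed_tools is not an array"); // assumption
  }
  for (const tool of tools) {
    const type =
      typeof tool === "object" && tool !== null
        ? (tool as { type?: unknown }).type
        : undefined;
    if (typeof type !== "string") {
      throw new SketchDriftError("executed_tools[].type is not a string");
    }
  }
  // Deliberately no checks on search_results or citation sub-fields:
  // an unexpected shape there is left for the parser to soft-degrade on.
}
```

With this split, a citation missing `score` still passes validation and stays on the Groq path, whereas a deep guard would raise drift and hand the request to a host without built-in tools.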

Docs corrected, not just added

The S4 README/CHANGELOG line saying result parsing was "being wired in a follow-up" is now flipped to live, with a metadata.reasoning note and a citations usage snippet.

Tests

Typecheck clean; 402 tests pass (+4): compound flatten with all four citation fields asserted directly, gpt-oss search-only filtering plus name preservation, and omission of the field for both empty results and absent executed_tools. The schema-canary suite is green: existing tool_calls responses still validate under the extended interface.

Sprint status: S0–S2 (#70), S3 (#71), S4 (#72) merged. This is S5. Remaining: S6 — round-out tests (request-shape parity already covered), CreditLedger doc note on surcharge non-attribution, and any final doc polish.

🤖 Generated with Claude Code

Surfaces Groq's server-side built-in tool results on the response, completing
the citation path the S4 docs described.

- formatResponse maps message.executed_tools[] → metadata.builtInToolResults
  (Array<{ type, name?, arguments?, results: [{title,url,content,score}] }>),
  using the S0-locked rule: keep only executions with a non-empty
  search_results.results, flatten them, preserve per-execution type/name/
  arguments. Non-search runs (code_interpreter) carry no citations and drop
  out by design; the field is omitted entirely when no search ran.
- message.reasoning (the model's internal search queries, present on both
  families) surfaces on metadata.reasoning when present.
- GROQ_RESPONSE_SCHEMA gains an optional, deliberately SHALLOW executed_tools
  entry (validates `type` only). Citation sub-fields are left unguarded:
  SchemaDriftError routes through the fallback chain to a host that doesn't
  run built-in tools, so a false drift on a single-sampled sub-field would
  silently degrade a working search into a tool-less response. The parser
  soft-degrades instead; citation-field coverage is a parser unit test (the
  binding review note's accepted alternative to a deep fixture).

Docs corrected (not just added): the S4 README/CHANGELOG line that said result
parsing was "being wired in a follow-up" is now flipped to live, with a
metadata.reasoning note and a citations usage snippet.

typecheck clean; 402 tests pass (+4: compound flatten with all four citation
fields asserted, gpt-oss search-only filtering + name preservation, empty/no
executed_tools omission). schema-canary suite green (existing tool_calls
responses still validate under the extended interface).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@stackbilt-admin merged commit adf0c61 into main on May 29, 2026
2 checks passed
@stackbilt-admin deleted the feat/groq-builtin-tools-response branch on May 31, 2026, 10:12