[audit-workflows] Agentic Workflow Audit — 2026-06-02 Evening (16:51–19:34Z): 97.8% success, but copilot-sdk session.idle timeout reaches producti [Content truncated due to length] #36517
Summary
Evening incremental audit of the 2026-06-02 16:51–19:34Z window (50 runs). Fleet health is strong: 97.8% success among completed runs (45/46), with the safe-output partial-failure class and token-budget-429 both absent this window. The single failure is the headline finding: the copilot-sdk session.idle 600s timeout has escalated from experimental feature branches to a production scheduled workflow on main.

🔴 Critical: copilot-sdk session.idle timeout now hits production main

Daily Security Observability Report (run 26836111514, copilot / claude-sonnet-4.6, schedule/main) was the only failure this window. On the new copilot-sdk sdk-driver path the session was created, the custom provider resolved with auth (provider=copilot baseUrl=api-proxy:10002), the prompt was sent, then:

Why this matters:
- patch-diff.githubusercontent.com).
- Earlier occurrences of the session.idle signature were confined to experimental branches (ab-advisor max-turns campaign). This is the first time it hit a production scheduled workflow on main.
- The sdk-driver path is now progressively rolled out to ~14 of 48 copilot+codex runs (~13 genuine production copilot workflows this window). 1 of ~13 prod sdk runs hung (~7%), while the legacy copilot CLI path was 0/34.
- The retry rule treated hasOutput=false as a startup crash and did not retry, but here the session was created and the prompt was sent, so this is a mid-session hang where a retry would likely have succeeded.

Recommended fix (High)
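The distinction drawn above between a startup crash and a mid-session hang can be sketched as a small classifier. This is only an illustration under stated assumptions: `RunState`, its field names, `classify_failure`, `should_retry`, and the `max_attempts` default are all hypothetical and do not reflect the actual sdk-driver code.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    STARTUP_CRASH = "startup_crash"
    MID_SESSION_HANG = "mid_session_hang"
    OTHER = "other"


@dataclass(frozen=True)
class RunState:
    # Hypothetical fields mirroring the facts cited in the audit;
    # the real driver may track these differently.
    session_created: bool
    prompt_sent: bool
    has_output: bool
    idle_timed_out: bool


def classify_failure(state: RunState) -> FailureKind:
    """Separate a hang after the prompt was sent from a crash before any session existed."""
    if state.idle_timed_out and state.session_created and state.prompt_sent:
        return FailureKind.MID_SESSION_HANG
    if not state.has_output and not state.session_created:
        return FailureKind.STARTUP_CRASH
    return FailureKind.OTHER


def should_retry(state: RunState, attempt: int, max_attempts: int = 2) -> bool:
    """Retry only mid-session hangs, and only while attempts remain (assumed policy)."""
    if attempt >= max_attempts:
        return False
    return classify_failure(state) is FailureKind.MID_SESSION_HANG
```

Under this sketch, a run with the session created, the prompt sent, no output, and an idle timeout is classified as a mid-session hang and retried once, while a run that never created a session is still treated as a startup crash and not retried.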
- Retry on session.idle timeout. The current "no output produced → not retrying" rule misclassifies a mid-session hang as a startup crash. The session existed and the prompt was sent; a retry is warranted.
- Investigate the missing session.idle signal from the headless Copilot server for some prompts (long tool-loop, lost SDK event, or never-emitted idle).

✅ Positives this window
isMaxEffectiveTokensExceededError signatures.

📊 Trend Charts (last ~14 days)
Daily success/failure volume with the success-rate line. After the 05-23 dip (41.6%, a bad-window outlier) the fleet has held a healthy ~84–96% band; the 06-02 full-day point (84.2%) reflects the morning token-429 + safe-output failures, while this evening slice recovers to 97.8%.
Daily token volume with a 7-day moving average. Usage is trending down — the MA has eased from ~48M to ~39M/day, and 06-02's 32.7M full-day total sits comfortably below the 05-31 spike (68.8M). No cost-runaway signal; heavy daily-aggregation workflows remain the variance drivers.
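For readers who want to reproduce the smoothing line, a 7-day moving average over daily token totals can be computed as in the sketch below. It assumes a trailing window (the audit does not say whether the chart uses a trailing or centered average), and the function name `trailing_moving_average` is hypothetical.

```python
from collections import deque
from typing import Iterable, List


def trailing_moving_average(values: Iterable[float], window: int = 7) -> List[float]:
    """Return the trailing mean at each point; early points average over fewer days."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_sum = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            # The oldest value is about to be evicted by the bounded deque.
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        averages.append(running_sum / len(recent))
    return averages
```

Feeding it one total per day, in date order, yields one smoothed point per day; the first six points average over a shorter history than the rest.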
Recommendations
- Retry on session.idle timeout and investigate the missing idle signal; de-risk the sdk-driver rollout for heavy production workflows (tracked under copilot-sdk-session-auth, now escalated).

Methodology & scope
- logs tool, start_date: -1d (window 16:51–19:34Z). Engine classification from aw_info.json engine_id (not lock-file scanning).
- agent-stdio.log + firewall-summary.json; the lone audit-agent self-match on the failure grep was a false positive (claude engine analyzing other runs' logs).
- audit-history.jsonl, known-issues.json (copilot-sdk escalated to High), recommendations.json, anomalies.json, metrics-summary.json.
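Classifying runs by the engine_id recorded in aw_info.json, rather than by scanning lock files, could look roughly like the sketch below. It assumes one directory per run containing an aw_info.json with a top-level engine_id key; that layout, and the function names `engine_for_run` and `count_engines`, are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def engine_for_run(run_dir: Path) -> str:
    """Read engine_id from a run's aw_info.json; fall back to 'unknown' if absent or unreadable."""
    info_path = run_dir / "aw_info.json"
    try:
        info = json.loads(info_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return "unknown"
    engine = info.get("engine_id") if isinstance(info, dict) else None
    return engine if isinstance(engine, str) and engine else "unknown"


def count_engines(runs_root: Path) -> Counter:
    """Tally runs per engine across all run directories under runs_root (assumed layout)."""
    return Counter(
        engine_for_run(run_dir)
        for run_dir in sorted(runs_root.iterdir())
        if run_dir.is_dir()
    )
```

Runs with a missing or malformed aw_info.json are counted as "unknown" rather than guessed, which keeps per-engine rates from being skewed by misattributed runs.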