feat(orchestrator): agent transparency + Direct Line + forced delegation (#332) by Weegy · Pull Request #335 · byte5ai/omadia

Weegy · 2026-06-18T13:20:02Z

Why

omadia routes every turn through a single orchestrator, which decides at the LLM's discretion whether, when, and how to delegate to a named specialist, and then rephrases the result. This produces gatekeeper drift: the orchestrator quietly concentrates competence and mediates everything, so a requested delegation is neither observable nor protected from suppression. That is a blocker for enterprise adoption, most visibly on MS Teams, where the user currently has no visibility into agent invocations at all.

This PR keeps the orchestrator in the loop and aware, but removes its destructive powers (suppressing/falsifying a delegation) while preserving its constructive ones, via three independently-shippable layers. Addresses the in-process scope of #332 (the Teams Adaptive-Card rendering lives in the private connector repo and is a follow-up).

What

L1 — Tamper-evident transparency (the minimum)

New contract on SemanticAnswer: agentsConsulted (curated projection of the deterministic runTrace.agentInvocations) + DelegatedAnswer.
toSemanticAnswer builds agentsConsulted from the choke-point trace — not the LLM's prose. The raw runTrace stays dropped at the channel boundary. A fabricated "I asked X" with no real invocation yields an empty list.
Exported agentsConsultedFooterText() plain-text fallback (🔎 Consulted: Strategist ✓ · 2 steps).
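To make the L1 projection concrete, here is a minimal sketch of deriving the consulted list and its footer from trace data alone. The `AgentInvocation` and `AgentConsulted` shapes, the `projectAgentsConsulted` and `footerText` names, and the `✗` marker for a failed consult are assumptions for illustration; they are not the SDK's actual contract.

```typescript
// Hypothetical shapes; the real field names in harness-channel-sdk may differ.
interface AgentInvocation {
  agentName: string;
  ok: boolean;
  steps: number;
}

interface AgentConsulted {
  name: string;
  ok: boolean;
  steps: number;
}

// Build the curated list from recorded invocations only. The LLM's answer text
// is never inspected, so a claimed consult with no invocation yields [].
function projectAgentsConsulted(invocations: readonly AgentInvocation[]): AgentConsulted[] {
  return invocations.map((inv) => ({ name: inv.agentName, ok: inv.ok, steps: inv.steps }));
}

// Plain-text fallback for channels without rich rendering.
// The "✗" marker for failures is an assumption of this sketch.
function footerText(consulted: readonly AgentConsulted[]): string {
  if (consulted.length === 0) return "";
  const parts = consulted.map(
    (a) => `${a.name} ${a.ok ? "✓" : "✗"} · ${a.steps} ${a.steps === 1 ? "step" : "steps"}`,
  );
  return `🔎 Consulted: ${parts.join(", ")}`;
}
```

Because the function's only input is the invocation list, prose such as "I asked the Strategist" has no way to influence the footer in this sketch.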

L2 — Direct Line (core of the maximum)

New pure module directLine.ts: parseDirectLineDirective (a leading #<specialist> token that survives Teams mention-strip + whitespace-collapse) and resolveDirectLineTarget (resolves only against this orchestrator's whitelisted sub-agents).
Collision rule (#332 Open Q3): a leading #token is treated as a directive only when it resolves to a whitelisted specialist. An unknown token falls through to the normal LLM turn, so ordinary messages that merely start with # (#urgent …, #1 priority …, hashtags) are never hijacked. Ambiguous tokens (matching ≥2 agents) lead to disambiguation instead of silent routing (Pitfall 7).
Orchestrator.executeDirectLine: binds the sub-agent's input to the user's verbatim payload via the deterministic choke point, captures the verbatim answer, and delivers it as a harness-owned, attributed delegatedAnswer the orchestrator can neither remove nor reword. Routed through the same privacyHandle.finalize. Faithful failures, never a cover-up.
Awareness/continuity (Pitfall 5): the verbatim exchange is persisted through the same sessionLogger as a normal turn (KG continuity + cross-session recall + turnId), not just the channel's prior-turn buffer.
Policy: strict passthrough (default) or guarded additive (an attributed ▸ omadia note: may be appended; the verbatim block stays byte-for-byte intact — the no-redaction invariant is structural).
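The parse-and-resolve behaviour described for L2 can be sketched roughly as follows. This is not the code in directLine.ts: the `parseDirective` and `resolveTarget` names, the result shapes, the token grammar, and the use of prefix matching as the source of ambiguity are all assumptions made for illustration.

```typescript
// Hypothetical result shapes; directLine.ts in the PR may model these differently.
type Directive = { token: string; payload: string };
type Resolution =
  | { kind: "none" } // not a directive: run the normal LLM turn
  | { kind: "ambiguous"; candidates: string[] }
  | { kind: "target"; agent: string; payload: string };

// Match a leading "#<token>". Only the single separator after the token is
// stripped, so inner and trailing whitespace of the payload is preserved.
function parseDirective(message: string): Directive | null {
  const m = /^#([A-Za-z][\w-]*)(?:\s([\s\S]*))?$/.exec(message);
  if (!m) return null;
  const token = m[1] ?? "";
  const rest = m[2] ?? "";
  // An all-whitespace remainder collapses to an empty payload.
  return { token, payload: rest.trim() === "" ? "" : rest };
}

// Resolve only against the whitelisted sub-agents. Prefix matching is an
// assumption of this sketch; it is one way a token can match two agents.
function resolveTarget(message: string, whitelist: readonly string[]): Resolution {
  const d = parseDirective(message);
  if (!d) return { kind: "none" };
  const needle = d.token.toLowerCase();
  const exact = whitelist.filter((a) => a.toLowerCase() === needle);
  const matches =
    exact.length > 0 ? exact : whitelist.filter((a) => a.toLowerCase().startsWith(needle));
  const first = matches[0];
  if (first === undefined) return { kind: "none" }; // unknown token: no hijack
  if (matches.length > 1) return { kind: "ambiguous", candidates: matches };
  return { kind: "target", agent: first, payload: d.payload };
}
```

With a whitelist of `["strategist", "statistician"]`, a message such as `#urgent fix this` resolves to `none` and proceeds as a normal turn, while `#st what now` is reported as ambiguous instead of being routed.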

L3 — Forced delegation obligation

ChatTurnInput.expectedDomainTool ports OB-31 to the orchestrator loop: if the turn would end (pure-text stop) without invoking the required sub-agent, the harness escalates once with tool_choice:{type:'tool',name:X} + a synthetic reminder. This forces the consult within the iteration budget; an absolute "cannot end without it" guarantee additionally needs the verifier/postcondition layer (out of scope here, per the issue).
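The escalate-once behaviour can be illustrated with a small loop. This is a sketch under simplifying assumptions: the provider is synchronous, and `ModelRequest`, `ModelStep`, `runTurnWithObligation`, and `lazyModel` are hypothetical names that do not appear in the PR. Only `expectedDomainTool` and the `tool_choice` shape `{type:'tool',name:X}` come from the description above.

```typescript
// Hypothetical minimal provider contract, for illustration only.
type ToolChoice = { type: "auto" } | { type: "tool"; name: string };
type ModelStep = { kind: "text"; text: string } | { kind: "tool_call"; name: string };
type ModelRequest = { toolChoice: ToolChoice; reminder?: string };
type Complete = (req: ModelRequest) => ModelStep;

function runTurnWithObligation(
  complete: Complete,
  expectedDomainTool: string | undefined,
  maxIterations: number,
): { invoked: string[]; escalations: number } {
  const invoked: string[] = [];
  let escalations = 0;
  let request: ModelRequest = { toolChoice: { type: "auto" } };

  for (let i = 0; i < maxIterations; i++) {
    const step = complete(request);
    request = { toolChoice: { type: "auto" } }; // a forced choice applies to one call only
    if (step.kind === "tool_call") {
      invoked.push(step.name);
      continue;
    }
    // Pure-text stop: escalate once if the obligation is still unmet.
    if (
      expectedDomainTool !== undefined &&
      !invoked.includes(expectedDomainTool) &&
      escalations === 0
    ) {
      escalations += 1;
      request = {
        toolChoice: { type: "tool", name: expectedDomainTool },
        reminder: `You must consult ${expectedDomainTool} before answering.`,
      };
      continue;
    }
    break; // obligation met, unset, or already escalated: let the turn end
  }
  return { invoked, escalations };
}

// Stand-in model that never delegates unless the tool choice forces it.
const lazyModel: Complete = (req) =>
  req.toolChoice.type === "tool"
    ? { kind: "tool_call", name: req.toolChoice.name }
    : { kind: "text", text: "final answer" };
```

In this sketch the escalation consumes an iteration, so a budget that runs out immediately after the escalation ends the turn without the consult. That matches the caveat above that an absolute guarantee needs a separate verifier/postcondition layer.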

MS Teams note (important)

Teams uses chat() → runTurn → chatInContextInner (non-streaming); web-ui uses chatStream → chatStreamInner (streaming). The issue located L2/L3 only in chatStream, which would have left Teams uncovered — so L2 and L3 are wired into BOTH parallel loops.

Files

harness-channel-sdk/src/{outgoing,toSemanticAnswer,chatAgent,index}.ts — L1 contract, projection, fallback, expectedDomainTool, done-event delegatedAnswer.
harness-orchestrator/src/directLine.ts (new) — pure parser/resolver/label.
harness-orchestrator/src/orchestrator.ts — executeDirectLine, guarded note, session-log persistence, options, obligation port in both loops.
harness-orchestrator/src/index.ts — exports.
test/orchestrator/directLine.test.ts (new) — 20 tests.

Validation

Full build clean (0 TS errors).
New tests 20/20 green: parser/resolver/label + verbatim-preservation; L1 projection + footer; L2 strict verbatim + verbatim-input-binding + LLM-never-called + unknown-token-fall-through + ambiguous-disambiguation + faithful-error + session-log persistence; guarded no-redaction + guarded degrades-to-strict-under-privacy; L3 forces-consult + no-op-when-unset.
Full suite 3503/3503 pass, 0 fail (last run). A non-deterministic, pre-existing test-pollution failure (a different unrelated builder route each run, e.g. builderEditRoutes / runtime-secrets) sometimes appears in the full run but passes in isolation and alongside the new tests — unrelated to this PR.

Acceptance criteria

Teams sees a consulted footer sourced from runTrace, not the LLM; a fabricated consult shows nothing.
A direct-line directive delivers the specialist's verbatim answer, attributed; the orchestrator cannot suppress/reword it.
Guarded-additive keeps the verbatim block byte-for-byte intact (structural; covered by a no-redaction test).
No PII leak to the model provider: guarded-additive runs an extra completion over the answer, so it degrades to strict passthrough whenever a privacy guard is active (tested). The verbatim block is delivered intact regardless. (Note: the user-facing delegated answer is the sub-agent's own LLM output, produced under the per-turn privacy handle.)
Ambiguous token → disambiguation, never a silent wrong route; unknown token → normal LLM turn (no hijack).
Sub-agent errors are delivered faithfully.
[~] (L3) A turn carrying an obligation is forced toward the consult via one OB-31 escalation; an absolute block needs the verifier/postcondition layer (not in this PR). expectedDomainTool is a primitive with no production producer yet; the Conductor (#321) is its intended caller.
No regression to orchestrator memory/recall/privacy/follow-ups/scheduling. Direct-line persists via the session logger AND fires fact-extraction, so the KG learns from delegated answers too.
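The guarded-additive criteria above (verbatim block intact, degrade to strict under a privacy guard) can be sketched as a small pure function. The `Policy` type, the `DelegatedAnswer` field names, `buildDelegatedAnswer`, and the `makeNote` callback are assumptions for illustration; in the PR the note comes from an extra provider completion, which is represented here by a plain callback.

```typescript
type Policy = "strict" | "guarded";

// Hypothetical shape: the note lives in its own field, so adding one can never
// alter the verbatim text.
interface DelegatedAnswer {
  agent: string;
  verbatim: string;
  note?: string;
}

function buildDelegatedAnswer(
  agent: string,
  verbatim: string,
  policy: Policy,
  privacyGuardActive: boolean,
  makeNote: (answer: string) => string, // stands in for the extra completion
): DelegatedAnswer {
  // Under an active privacy guard the note generator is never called, so the
  // un-masked answer is not sent anywhere for a second pass.
  const effective: Policy = policy === "guarded" && privacyGuardActive ? "strict" : policy;
  if (effective === "strict") return { agent, verbatim };
  return { agent, verbatim, note: `▸ omadia note: ${makeNote(verbatim)}` };
}
```

Keeping the note in a separate field is one way to make the no-redaction property structural: no code path in this sketch produces a modified copy of `verbatim`.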

Independent review

An independent Codex/GPT-5.4 adversarial review was run. Fixes applied from it: streaming turnExternalId parity, truly-verbatim payload (only the separator is stripped), guarded-mode PII degrade-to-strict, and KG fact-extraction parity. Confirmed structurally sound: the no-redaction invariant and the directive collision rule.

Follow-ups (not in this PR)

Teams connector rendering (private repo) after this contract merges.
web-ui curated render of agentsConsulted / delegatedAnswer.
A privacy-aware guarded note (intern the note input) so guarded mode can also run under an active privacy guard.
Non-stream relay does not run the turn-hook side-channel / entity-ref collection (no plan, no orchestrator-level tool calls) — fire them if a future need arises.

Complementary to the Conductor (#321): this PR delivers the per-turn trust primitives; the Conductor composes L3 obligations into multi-step processes.

… forced delegation (#332)

Address orchestrator "gatekeeper drift": make a requested sub-agent delegation observable and non-suppressible across every channel (incl. MS Teams), while keeping the orchestrator in the loop and aware.

L1 (Transparency): project the deterministic runTrace.agentInvocations into a curated SemanticAnswer.agentsConsulted field (+ DelegatedAnswer contract + a plain-text footer fallback). Teams/Telegram, which never see the raw runTrace, now show which specialist actually ran, sourced from the choke-point trace, not the LLM's prose. A fabricated "I asked X" with no real invocation shows nothing.

L2 (Direct Line): a core-parsed `#<specialist>` directive binds the sub-agent's input to the user's verbatim payload via the deterministic choke point and delivers its verbatim answer as a harness-owned, attributed delegatedAnswer the orchestrator can neither suppress nor reword. Strict passthrough (default) or guarded additive note (never a redaction). Wired into BOTH the non-streaming (chat()/Teams) and streaming (chatStream/web-ui) paths; unknown/ambiguous tokens disambiguate instead of silently routing; sub-agent errors are delivered faithfully.

L3 (Forced delegation): ChatTurnInput.expectedDomainTool ports OB-31 to the orchestrator loop: it forces tool_choice + a synthetic reminder when the turn would otherwise end without the required consult.

Tests: test/orchestrator/directLine.test.ts (15 new, all green).

…tekeeper-drift-plan

…#332)

Confidence-check follow-ups on the #332 Direct Line:

- Collision rule (Open Q3): a leading `#token` is now a directive ONLY when it resolves to a whitelisted specialist. An UNKNOWN token falls through to the normal LLM turn, so ordinary messages that merely start with `#` (`#urgent …`, `#1 priority …`, hashtags) are no longer hijacked into a "no such agent" reply. Ambiguous tokens (matched ≥2 agents) still disambiguate.
- Awareness/continuity (Pitfall 5): a direct-line turn is now persisted through the same `sessionLogger` as a normal turn (KG continuity + cross-session recall + turnId), instead of relying only on the channel's prior-turn buffer.

Tests: +ambiguous-disambiguation, +session-logger persistence, unknown-token now asserts LLM fall-through (17 green).

…verbatim payload (#332)

Independent Codex/GPT-5.4 review follow-ups:

- Streaming direct-line `onAfterTurn` now carries `turnExternalId` (parity with the normal done branch), so graph-linking observers (#133 E8) fire on direct-line streams too.
- Directive payload now keeps internal/trailing whitespace byte-for-byte; only the leading separator is stripped, so a whitespace-significant payload (fenced code block) reaches the sub-agent truly verbatim. An all-whitespace remainder still collapses to an empty payload.

Tests 18 green (+verbatim-preservation, +all-whitespace-empty).

Codex review MEDIUM follow-ups:

- Guarded-additive PII: the note runs an extra `provider.complete` over the verbatim answer that is NOT routed through the privacy interning path. Degrade guarded → strict whenever a privacy guard is active, so un-masked PII is never forwarded to the model provider. The verbatim block is still delivered intact.
- KG-learning parity: a direct-line turn skips chatInContext*, so the knowledge graph never learned from a delegated answer. Fire `factExtractor.extractAndIngest` (fire-and-forget) after the session log lands, mirroring a normal turn.

Tests 20 green (+guarded no-redaction, +guarded degrades-to-strict-under-privacy).

CI lint) The direct-line dispatch uses the RunTraceCollector's own observer (handle.observer) for the trace, not the passed one, so the parameter was dead. Removing it (and the stream call-site arg) clears the no-unused-vars lint error.

Weegy added 6 commits June 18, 2026 15:18

Merge remote-tracking branch 'origin/main' into worktree-issue-332-ga…

448ae8b

…tekeeper-drift-plan

Weegy merged commit 8997e4d into main Jun 18, 2026
7 checks passed

Weegy mentioned this pull request Jun 18, 2026

Trustworthy sub-agent delegation: tamper-evident transparency, harness-guaranteed Direct Line, and forced delegation #332

Open

8 tasks

Weegy merged 6 commits into main from feat/issue-332-gatekeeper-drift
