fix(sglang): derive enable_eagle from SpeculativeAlgorithm.is_eagle() (covers EAGLE3) by yifjiang · Pull Request #10982 · ai-dynamo/dynamo

yifjiang · 2026-06-26T05:01:24Z

Problem

components/src/dynamo/sglang/register.py sets ModelRuntimeConfig.enable_eagle = True from a hand-maintained name set ("EAGLE", "NEXTN"). The KV router uses enable_eagle (lib/llm/src/discovery/watcher.rs) to bigram-align the frontend's prompt-block hashes (window = stride + 1, lib/kv-router/src/protocols.rs) so they match the worker's KV events.

But the worker's radix cache bigram-keys its KV-event hashes if and only if SpeculativeAlgorithm.is_eagle() is true (srt/managers/scheduler.py), which holds for {EAGLE, EAGLE3, FROZEN_KV_MTP}. The name set had drifted from that predicate:

- EAGLE3 was missing → an EAGLE3 worker publishes enable_eagle=false → the frontend hashes at plain page_size while the worker emits bigram-keyed hashes → overlap is always 0 → KV-aware routing is cache-blind for EAGLE3 (falls back to load-only).
- "NEXTN" is dead: ServerArgs .upper()s the value and _resolve_speculative_algorithm_alias normalizes NEXTN/EAGLE → EAGLE (or FROZEN_KV_MTP for Gemma4 drafts) before register sees it, so the literal never matches "NEXTN"; FROZEN_KV_MTP (is_eagle()=true) was also missing.
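The mismatch described above can be illustrated with a small sketch. It assumes the worker's bigram keying can be modeled as a window of page_size + 1 tokens advanced by a stride of page_size, as the "window = stride + 1" note suggests. The `block_hashes` helper and the SHA-256 hashing are hypothetical stand-ins, not the hashing dynamo or sglang actually use.

```python
import hashlib
from typing import List, Sequence


def block_hashes(tokens: Sequence[int], page_size: int, bigram: bool) -> List[str]:
    """Hash every full block (hypothetical helper, not the real router code).

    With bigram=True each window spans page_size + 1 tokens; the stride
    between windows is always page_size.
    """
    window = page_size + 1 if bigram else page_size
    hashes = []
    start = 0
    while start + window <= len(tokens):
        chunk = tokens[start:start + window]
        data = ",".join(map(str, chunk)).encode()
        hashes.append(hashlib.sha256(data).hexdigest())
        start += page_size
    return hashes


tokens = list(range(64))

# The worker is modeled as bigram-keyed because is_eagle() is true for EAGLE3.
worker_events = block_hashes(tokens, 16, bigram=True)

# enable_eagle=false: the frontend hashes plain pages.
frontend_plain = block_hashes(tokens, 16, bigram=False)
# enable_eagle=true: the frontend uses the same window as the worker.
frontend_eagle = block_hashes(tokens, 16, bigram=True)

overlap_before = len(set(frontend_plain) & set(worker_events))
overlap_after = len(set(frontend_eagle) & set(worker_events))
print(overlap_before, overlap_after)
```

Under these assumptions the plain-page hashes share nothing with the worker's hashes, while the windowed hashes match block for block, which is the cache-blind versus cache-aware difference the PR reports.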

Fix

Derive enable_eagle from spec_algorithm.is_eagle() — the same predicate the radix cache uses — so the frontend window and the worker's events stay in lockstep by construction; this covers EAGLE3 and FROZEN_KV_MTP and drops the dead literal:

- if server_args.speculative_algorithm in ("EAGLE", "NEXTN"):
+ if _eagle_enabled_for(server_args.speculative_algorithm):   # SpeculativeAlgorithm.from_string(...).is_eagle()
      runtime_config.enable_eagle = True

e2e before/after (public repro)

Stock dynamo + sglang (nvcr.io/nvidia/ai-dynamo/sglang-runtime:1.3.0-dev.1-cuda12, sglang 0.5.12.post1), Qwen/Qwen3-4B + AngelSlim/Qwen3-4B_eagle3 (EAGLE3 draft), single warm node, --router-mode kv --router-kv-events, EAGLE3 spec, a repeated ~930-token prefix (page-size 16 → 58 blocks). Same binary, same models — only register.py differs between phases:

| Phase | Published `enable_eagle` | KV-router effective cached blocks on the warm repeat |
| --- | --- | --- |
| Before (stock `register.py`) | `false` | 0.00 (cache-blind on every repeat) |
| After (this fix) | `true` | 58.00 (full prefix credited) |

Before, the router's Formula log reported 0.00 effective cached blocks on every warm request (the EAGLE3 worker's bigram events never matched the plain-token frontend hashes). After, the same warm prefix is logged with 58.00 effective cached blocks: the worker's events now match, so KV-aware routing sees the cache. (The bug and enable_eagle: false were also reproduced on a separate EAGLE3 deployment; this Qwen3-4B run is the public, reproducible demonstration.)
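The 58-block figure follows from the setup by simple arithmetic. As a quick check, taking the prefix length as exactly 930 tokens (the description only says roughly 930) and counting full pages only:

```python
def full_blocks(num_tokens: int, page_size: int) -> int:
    """Number of complete pages in a prefix; a trailing partial page is ignored."""
    return num_tokens // page_size


blocks = full_blocks(930, 16)
remainder = 930 % 16
print(blocks, remainder)  # 58 full blocks, 2 tokens left over
```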

Testing

- New parametrized unit test test_eagle_enabled_for_speculative_algorithm pins the enabled set to is_eagle(): EAGLE/EAGLE3/FROZEN_KV_MTP → True; DFLASH/NGRAM/STANDALONE/NONE/None → False. This guards against the set drifting again.
- The downstream half is already covered: lib/kv-router/src/protocols.rs::test_compute_block_hash_for_seq_eagle_windows exercises is_eagle = Some(true) → the stride+1 bigram window. So once enable_eagle tracks is_eagle(), EAGLE3/FROZEN_KV_MTP flow through the same validated bigram-window path EAGLE already used.

Scope

Any EAGLE3 (or FROZEN_KV_MTP) model under --router-mode kv with ≥2 workers and worker KV events; single-worker / no-router / no-kv-events unaffected.

github-actions · 2026-06-26T05:01:34Z

👋 Hi yifjiang! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

datadog-official · 2026-06-26T05:20:27Z

⚠️ Warnings

🚦 4 Pipeline jobs failed

PR | deploy-operator

PR | deploy-status-check

PR | dynamo-runtime / rust-gpu

View all 4 failed jobs.

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: e4b95f4

… (covers EAGLE3)

ModelRuntimeConfig.enable_eagle was set from a hand-maintained name set ("EAGLE", "NEXTN"). The KV router uses enable_eagle to bigram-align the frontend's prompt-block hashes so they match the worker's KV events.

But sglang's radix cache bigrams its KV-event hashes iff SpeculativeAlgorithm.is_eagle() (srt/managers/scheduler.py) = {EAGLE, EAGLE3, FROZEN_KV_MTP}, so the name set had drifted from the real predicate:

- EAGLE3 was missing -> an EAGLE3 worker publishes enable_eagle=false -> the frontend hashes prompt blocks at plain page_size while the worker emits bigram-keyed hashes -> overlap is always 0 -> KV-aware routing is cache-blind for EAGLE3.
- "NEXTN" in the set is dead: ServerArgs normalizes NEXTN/EAGLE to EAGLE (or FROZEN_KV_MTP for Gemma4 drafts) before register sees it, so the literal never matches "NEXTN" -- and FROZEN_KV_MTP (is_eagle()=true) was also missing.

Derive enable_eagle from spec_algorithm.is_eagle() so it stays in lockstep with the radix's bigram condition by construction; this covers EAGLE3 and FROZEN_KV_MTP and drops the dead literal.

Add a parametrized unit test pinning the enabled set to is_eagle().

Signed-off-by: Yifan Jiang <19356972+yifjiang@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pull-request-size Bot added the size/XS label Jun 26, 2026

yifjiang temporarily deployed to external_collaborator June 26, 2026 05:01 — with GitHub Actions Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:01 Inactive

github-actions Bot added the fix label Jun 26, 2026

github-actions Bot added external-contribution Pull request is from an external contributor backend::sglang Relates to the sglang backend labels Jun 26, 2026

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:06 Inactive

yifjiang force-pushed the yifjiang/sglang-eagle3-enable-eagle branch from 635fe3d to f6d54b0 Compare June 26, 2026 05:09

pull-request-size Bot removed the size/XS label Jun 26, 2026

yifjiang temporarily deployed to external_collaborator June 26, 2026 05:09 — with GitHub Actions Inactive

pull-request-size Bot added the size/M label Jun 26, 2026

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:09 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:11 Inactive

yifjiang force-pushed the yifjiang/sglang-eagle3-enable-eagle branch from f6d54b0 to 25ec553 Compare June 26, 2026 05:31

yifjiang temporarily deployed to external_collaborator June 26, 2026 05:31 — with GitHub Actions Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:31 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:32 Inactive

yifjiang force-pushed the yifjiang/sglang-eagle3-enable-eagle branch from 25ec553 to e4b95f4 Compare June 26, 2026 05:33

yifjiang temporarily deployed to external_collaborator June 26, 2026 05:33 — with GitHub Actions Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:33 Inactive

yifjiang changed the title ~~fix(sglang): include EAGLE3 in enable_eagle so KV-aware routing works for EAGLE3 workers~~ fix(sglang): derive enable_eagle from SpeculativeAlgorithm.is_eagle() (covers EAGLE3) Jun 26, 2026

copy-pr-bot Bot temporarily deployed to GITLAB June 26, 2026 05:37 Inactive

yifjiang wants to merge 1 commit into ai-dynamo:main from yifjiang:yifjiang/sglang-eagle3-enable-eagle
