Launch native engines from AIMA_ENGINE_DIR (scanned ⇒ launchable) #86
Open
rjckkkkk wants to merge 1 commit into
The engine scanner discovers native binaries in AIMA_ENGINE_DIR, but native
deploy resolved the binary only via dist → BinarySource probe/download → PATH.
When a pre-installed engine was registered via AIMA_ENGINE_DIR but not in dist
or PATH, and the catalog probe-path injection didn't reach the resolved engine
asset (e.g. AMD-Windows, where gpu_arch matching picks a linux-only asset), the
launch command fell back to the bare name ("llama-server") and Windows reported
"'llama-server' is not recognized" — so a model that scanned fine could not be
deployed.
Resolve the native binary against AIMA_ENGINE_DIR too — the SAME dirs the engine
scanner reads — so "scanned ⇒ launchable" holds regardless of catalog/overlay
engine selection. Order: dist → AIMA_ENGINE_DIR → auto-download → PATH. No-op when
AIMA_ENGINE_DIR is unset, so other devices/runtimes are unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rjckkkkk added a commit that referenced this pull request on Jun 10, 2026
#86)
Refresh dist/aima-windows-amd64.exe (v0.5-dev-amd-strix-halo, commit fc3ef41) so the
bundled binary now carries the out-of-box AMD-HIP llama.cpp work:
- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe): a no-NVIDIA Strix
Halo box auto-downloads the official ROCm/HIP llama.cpp (b9330, win-hip-radeon-x64)
instead of the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR
(dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp
of ANY version is launchable -- the partner's llama.cpp version is supported
whether or not it matches the bundled b9330.
The catalog YAML ships compiled into the binary, verified on the 395 rig:
`aima engine info llamacpp-hip-windows` returns the b9330 asset; `aima version` ->
v0.5-dev-amd-strix-halo / fc3ef41; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.
README + serve.bat updated: AIMA_ENGINE_DIR is now optional.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rjckkkkk added a commit that referenced this pull request on Jun 10, 2026
#86)
Refresh dist/aima-windows-amd64.exe so the bundled binary carries the out-of-box
AMD-HIP llama.cpp work. Version string is date-stamped to be distinguishable from
the prior handoff build:
aima version -> v0.5-dev-amd-strix-halo-20260610 (commit fc3ef41)
(vs the earlier v0.5-dev-amd-strix-halo, which lacked #85/#86)
- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe): a no-NVIDIA Strix
Halo box auto-downloads the official ROCm/HIP llama.cpp (b9330, win-hip-radeon-x64)
instead of the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR
(dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp
of ANY version is launchable -- the partner's llama.cpp version is supported
whether or not it matches the bundled b9330.
The catalog YAML ships compiled into the binary AND in source
(catalog/engines/llamacpp-hip-windows.yaml). Verified on the 395 rig: `aima version` ->
v0.5-dev-amd-strix-halo-20260610 / fc3ef41; `aima engine info llamacpp-hip-windows`
-> b9330 asset; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.
README + serve.bat updated: AIMA_ENGINE_DIR is now optional.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rjckkkkk added a commit that referenced this pull request on Jun 10, 2026
The dist exe filename now carries its version string so a new build does NOT
overwrite the previous one in place. Two builds coexist:
aima-windows-amd64-v0.5-dev-amd-strix-halo-20260610.exe -> v0.5-dev-amd-strix-halo-20260610
latest; adds the out-of-box AMD-HIP llama.cpp engine (#85, #86). serve.bat uses this.
aima-windows-amd64-v0.5-dev-amd-strix-halo.exe -> v0.5-dev-amd-strix-halo
restored 2026-06-09 build (#78-#83 only, no HIP auto-download). Kept for rollback.
Both from source commit fc3ef41; filename == the exe's own `aima version` string.
- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe + in source at
catalog/engines/llamacpp-hip-windows.yaml, pins official b9330 win-hip-radeon-x64):
a no-NVIDIA Strix Halo box auto-downloads the right ROCm/HIP llama.cpp instead of
the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR
(dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp
of ANY version is launchable -- the partner's llama.cpp version is supported
whether or not it matches the bundled b9330.
Verified the latest build on the 395 rig: `aima version` ->
v0.5-dev-amd-strix-halo-20260610 / fc3ef41; `aima engine info llamacpp-hip-windows`
-> b9330 asset; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.
README + serve.bat updated (build table; AIMA_ENGINE_DIR now optional).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
The engine scanner discovers native engine binaries in `AIMA_ENGINE_DIR` (PR #80), but native deploy resolved the binary only via dist → `BinarySource` probe/download → PATH. The probe path is supposed to carry the scanned binary's absolute path into the deploy, but on AMD-Windows the overlay injects it into the hardware-preferred engine asset (llamacpp-vulkan, linux-only) while native deploy resolves to a different asset, so the probe never reaches the launch command.

Result: a pre-installed llama.cpp registered via `AIMA_ENGINE_DIR` scans fine but won't deploy. The launch command falls back to the bare name `llama-server`, and Windows reports that 'llama-server' is not recognized.

Fix
Resolve the native binary against `AIMA_ENGINE_DIR` as well (the same dirs the engine scanner reads), so scanned ⇒ launchable holds regardless of catalog/overlay engine selection.
- `internal/runtime/native.go`: `WithEngineDirs` option + `findInEngineDirs`; resolution order is now dist → AIMA_ENGINE_DIR → auto-download → PATH.
- `cmd/aima/infra.go`: populate engine dirs from `AIMA_ENGINE_DIR` (mirrors how the scanner reads it).
- No-op when `AIMA_ENGINE_DIR` is unset → other devices/runtimes unaffected. Unit test `TestFindInEngineDirsResolvesScannedBinary`.

Verified on the 395 (Radeon 8060S, RDNA3.5, Windows)
Reproducing the partner's exact setup (`AIMA_ENGINE_DIR` pointing at a pre-installed llama.cpp, empty dist):
→ running `llama-server` path = `D:\tools\llama-b9180-win-hip-radeon-x64\llama-server.exe` (the absolute `AIMA_ENGINE_DIR` binary, not a bare name / not a download), health 200, fingerprint `b9180`, GPU offloaded. Before the fix this failed with the "not recognized" error above.

🤖 Generated with Claude Code