Launch native engines from AIMA_ENGINE_DIR (scanned ⇒ launchable) by rjckkkkk · Pull Request #86 · Approaching-AI/AIMA

rjckkkkk · 2026-06-10T04:54:46Z

Problem

The engine scanner discovers native engine binaries in AIMA_ENGINE_DIR (PR #80), but native deploy resolved the binary only via: dist → BinarySource probe/download → PATH. The probe path is supposed to carry the scanned binary's absolute path into the deploy, but on AMD-Windows the overlay injects it into the hardware-preferred engine asset (llamacpp-vulkan, linux-only) while native deploy resolves to a different asset — so the probe never reaches the launch command.

Result: a pre-installed llama.cpp registered via AIMA_ENGINE_DIR scans fine but won't deploy — the launch command falls back to the bare name llama-server, and Windows errors:

'llama-server' is not recognized as an internal or external command…

Fix

Resolve the native binary against AIMA_ENGINE_DIR as well — the same dirs the engine scanner reads — so scanned ⇒ launchable holds regardless of catalog/overlay engine selection.

- internal/runtime/native.go: WithEngineDirs option + findInEngineDirs; resolution order is now dist → AIMA_ENGINE_DIR → auto-download → PATH.
- cmd/aima/infra.go: populate engine dirs from AIMA_ENGINE_DIR (mirrors how the scanner reads it).
- No-op when AIMA_ENGINE_DIR is unset → other devices/runtimes unaffected.
- Unit test: TestFindInEngineDirsResolvesScannedBinary.

Verified on the 395 (Radeon 8060S, RDNA3.5, Windows)

Reproducing the partner's exact setup — AIMA_ENGINE_DIR pointing at a pre-installed llama.cpp, empty dist:

aima deploy Qwen3.5-9B-Q4_K_M --engine llamacpp

→ The running llama-server path is D:\tools\llama-b9180-win-hip-radeon-x64\llama-server.exe (the absolute AIMA_ENGINE_DIR binary, not a bare name and not a download); health returned 200, fingerprint b9180, GPU offloaded. Before the fix this failed with the "not recognized" error above.

🤖 Generated with Claude Code

The engine scanner discovers native binaries in AIMA_ENGINE_DIR, but native deploy resolved the binary only via dist → BinarySource probe/download → PATH. When a pre-installed engine was registered via AIMA_ENGINE_DIR but not in dist or PATH, and the catalog probe-path injection didn't reach the resolved engine asset (e.g. AMD-Windows, where gpu_arch matching picks a linux-only asset), the launch command fell back to the bare name ("llama-server") and Windows reported "'llama-server' is not recognized", so a model that scanned fine could not be deployed.

Resolve the native binary against AIMA_ENGINE_DIR too (the SAME dirs the engine scanner reads) so "scanned ⇒ launchable" holds regardless of catalog/overlay engine selection. Order: dist → AIMA_ENGINE_DIR → auto-download → PATH. No-op when AIMA_ENGINE_DIR is unset, so other devices/runtimes are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

#86) Refresh dist/aima-windows-amd64.exe (v0.5-dev-amd-strix-halo, commit fc3ef41) so the bundled binary now carries the out-of-box AMD-HIP llama.cpp work:

- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe): a no-NVIDIA Strix Halo box auto-downloads the official ROCm/HIP llama.cpp (b9330, win-hip-radeon-x64) instead of the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR (dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp of ANY version is launchable -- the partner's llama.cpp version is supported whether or not it matches the bundled b9330.

The catalog YAML ships compiled into the binary, verified on the 395 rig: `aima engine info llamacpp-hip-windows` returns the b9330 asset; `aima version` -> v0.5-dev-amd-strix-halo / fc3ef41; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.

README + serve.bat updated: AIMA_ENGINE_DIR is now optional.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

#86) Refresh dist/aima-windows-amd64.exe so the bundled binary carries the out-of-box AMD-HIP llama.cpp work. Version string is date-stamped to be distinguishable from the prior handoff build: aima version -> v0.5-dev-amd-strix-halo-20260610 (commit fc3ef41) (vs the earlier v0.5-dev-amd-strix-halo, which lacked #85/#86)

- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe): a no-NVIDIA Strix Halo box auto-downloads the official ROCm/HIP llama.cpp (b9330, win-hip-radeon-x64) instead of the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR (dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp of ANY version is launchable -- the partner's llama.cpp version is supported whether or not it matches the bundled b9330.

The catalog YAML ships compiled into the binary AND in source (catalog/engines/llamacpp-hip-windows.yaml). Verified on the 395 rig: `aima version` -> v0.5-dev-amd-strix-halo-20260610 / fc3ef41; `aima engine info llamacpp-hip-windows` -> b9330 asset; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.

README + serve.bat updated: AIMA_ENGINE_DIR is now optional.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The dist exe filename now carries its version string so a new build does NOT overwrite the previous one in place. Two builds coexist:

- aima-windows-amd64-v0.5-dev-amd-strix-halo-20260610.exe -> v0.5-dev-amd-strix-halo-20260610: latest; adds the out-of-box AMD-HIP llama.cpp engine (#85, #86). serve.bat uses this.
- aima-windows-amd64-v0.5-dev-amd-strix-halo.exe -> v0.5-dev-amd-strix-halo: restored 2026-06-09 build (#78-#83 only, no HIP auto-download). Kept for rollback.

Both from source commit fc3ef41; filename == the exe's own `aima version` string.

- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe + in source at catalog/engines/llamacpp-hip-windows.yaml, pins official b9330 win-hip-radeon-x64): a no-NVIDIA Strix Halo box auto-downloads the right ROCm/HIP llama.cpp instead of the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR (dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp of ANY version is launchable -- the partner's llama.cpp version is supported whether or not it matches the bundled b9330.

Verified the latest build on the 395 rig: `aima version` -> v0.5-dev-amd-strix-halo-20260610 / fc3ef41; `aima engine info llamacpp-hip-windows` -> b9330 asset; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.

README + serve.bat updated (build table; AIMA_ENGINE_DIR now optional).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rjckkkkk wants to merge 1 commit into develop from feat/native-engine-dir-resolution
