Detect Windows hardware via CIM, add AMD GPU detection #78
Open
rjckkkkk wants to merge 3 commits into
Windows 11 24H2+ removes the legacy wmic CLI, so CPU/RAM detection on
modern Windows returned an empty model and zero RAM. AMD GPUs were also
invisible on Windows: the probe chain only knows nvidia-smi/rocm-smi and
the sysfs fallback is Linux-only, so AMD APU hosts (Ryzen AI Max+ "Strix
Halo") detected no accelerator at all.

Replace wmic with `powershell Get-CimInstance` for CPU, RAM, pagefile and
CPU load, and add a Windows Win32_VideoController GPU fallback wired into
detectGPU through a detectPlatformGPU hook (no-op on non-Windows). AMD
identity (name/gfx/arch/unified) is resolved from the PCI device ID via
the existing amdPCIToInfo, shared with the Linux sysfs path.

CIM cannot report true APU VRAM (Win32 AdapterRAM saturates at 4 GiB) or
GPU utilization, so VRAM falls back to OS-visible RAM via the existing
unified-memory backfill; exact carve-out still needs amd-smi/rocm-smi.

Pure CIM JSON parsers live in cim.go (no build tags) with table-driven
tests in cim_test.go using fixtures captured from a real Strix Halo box.

Verified on AMD Ryzen AI Max+ 395 (Radeon 8060S, gfx1151): hal detect now
reports the GPU (RDNA3.5, driver 32.0.31007.1017), CPU model + 16c/32t,
and 32 GB RAM.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
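For readers unfamiliar with the CIM path, the following is a minimal sketch of what a pure Win32_VideoController JSON parser could look like. It is not the code in cim.go: the type and function names (`videoController`, `parseVideoControllers`, `pciIDs`) and the fixture values are illustrative assumptions. It assumes the PowerShell output was piped through `ConvertTo-Json`, which emits a bare object for a single adapter and an array for several, and that the PCI IDs are read from the `PNPDeviceID` property.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// videoController mirrors the few Win32_VideoController properties used here.
// Hypothetical type; the PR's real struct is not shown in the description.
type videoController struct {
	Name          string `json:"Name"`
	DriverVersion string `json:"DriverVersion"`
	PNPDeviceID   string `json:"PNPDeviceID"`
}

// parseVideoControllers accepts ConvertTo-Json output, which is a single
// object when one adapter is present and an array when there are several.
func parseVideoControllers(raw []byte) ([]videoController, error) {
	raw = bytes.TrimSpace(raw)
	if len(raw) == 0 {
		return nil, nil
	}
	if raw[0] == '[' {
		var list []videoController
		if err := json.Unmarshal(raw, &list); err != nil {
			return nil, err
		}
		return list, nil
	}
	var one videoController
	if err := json.Unmarshal(raw, &one); err != nil {
		return nil, err
	}
	return []videoController{one}, nil
}

var pciIDRe = regexp.MustCompile(`(?i)VEN_([0-9A-F]{4})&DEV_([0-9A-F]{4})`)

// pciIDs extracts the PCI vendor and device IDs from a PNPDeviceID string
// such as `PCI\VEN_1002&DEV_....`; ok is false for non-PCI devices.
func pciIDs(pnp string) (vendor, device uint16, ok bool) {
	m := pciIDRe.FindStringSubmatch(pnp)
	if m == nil {
		return 0, 0, false
	}
	v, err1 := strconv.ParseUint(m[1], 16, 16)
	d, err2 := strconv.ParseUint(m[2], 16, 16)
	if err1 != nil || err2 != nil {
		return 0, 0, false
	}
	return uint16(v), uint16(d), true
}

func main() {
	// Illustrative fixture, not one captured by the PR; the device ID is made up
	// for the example. 0x1002 is the AMD PCI vendor ID.
	fixture := []byte(`{"Name":"Example Radeon","DriverVersion":"32.0.31007.1017","PNPDeviceID":"PCI\\VEN_1002&DEV_1586&REV_C1"}`)
	gpus, err := parseVideoControllers(fixture)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, g := range gpus {
		if v, d, ok := pciIDs(g.PNPDeviceID); ok && v == 0x1002 {
			fmt.Printf("AMD GPU %q, device 0x%04x, driver %s\n", g.Name, d, g.DriverVersion)
		}
	}
}
```

Keeping the parser free of process execution is what lets it be tested without build tags: the test feeds it captured JSON, and only the thin Windows-only caller shells out to PowerShell.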
On Strix Halo and similar unified-memory APUs, Windows exposes only a
fraction of physical memory to the OS (e.g. 32 GiB of 128 GiB); the rest
is carved out for the iGPU. TotalVisibleMemorySize therefore underreported
a 128 GiB box as 32 GiB, which also flowed into the unified-VRAM backfill
and the onboarding "unified memory" card.

Query Win32_PhysicalMemory (sum of DIMM capacity) and use it as
RAM.TotalMiB when it exceeds the OS-visible total; recompute AvailableMiB
as total - OS-used so it stays correct for both unified and conventional
hosts.

Tests (cim_test.go, build-tag-free): parse the Measure-Object Sum JSON;
override to 128 GiB on a Strix Halo fixture; no-shrink / no-op on
conventional hosts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
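The override rule above can be sketched as a pure function. This is an illustration under stated assumptions, not the PR's code: `ramInfo`, `parseInstalledMiB`, and `applyInstalledRAM` are hypothetical names, and the JSON shape assumes `Measure-Object -Sum` output piped through `ConvertTo-Json`, with `Sum` in bytes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ramInfo is a hypothetical stand-in for the RAM section of the detect result.
type ramInfo struct {
	TotalMiB     uint64
	AvailableMiB uint64
}

// parseInstalledMiB reads the Sum field (bytes) and converts it to MiB.
// Sum is decoded as float64 so both integer and exponent forms are accepted.
func parseInstalledMiB(raw []byte) (uint64, error) {
	var m struct {
		Sum float64 `json:"Sum"`
	}
	if err := json.Unmarshal(raw, &m); err != nil {
		return 0, err
	}
	if m.Sum <= 0 {
		return 0, nil
	}
	return uint64(m.Sum) / (1024 * 1024), nil
}

// applyInstalledRAM raises TotalMiB to the installed DIMM capacity when that
// exceeds the OS-visible total, and recomputes AvailableMiB as total minus
// OS-used. It never shrinks the total, so conventional hosts are unchanged.
func applyInstalledRAM(osVisible ramInfo, installedMiB uint64) ramInfo {
	if installedMiB <= osVisible.TotalMiB {
		return osVisible
	}
	var usedMiB uint64
	if osVisible.TotalMiB > osVisible.AvailableMiB {
		usedMiB = osVisible.TotalMiB - osVisible.AvailableMiB
	}
	return ramInfo{TotalMiB: installedMiB, AvailableMiB: installedMiB - usedMiB}
}

func main() {
	// Illustrative numbers: 32 GiB OS-visible, 128 GiB installed.
	installed, err := parseInstalledMiB([]byte(`{"Count":4,"Sum":137438953472}`))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	got := applyInstalledRAM(ramInfo{TotalMiB: 32768, AvailableMiB: 20480}, installed)
	fmt.Printf("total=%d MiB available=%d MiB\n", got.TotalMiB, got.AvailableMiB)
}
```

Subtracting OS-used memory from the new total, rather than scaling the old available figure, keeps the result meaningful on both kinds of host: on a conventional machine the function returns its input untouched.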
On Strix Halo, Win32 AdapterRAM saturates at 4 GiB and there is no
rocm-smi, so the unified-memory backfill set GPU.VRAMMiB = installed RAM
(128 GiB). But the OS carves up that 128 GiB pool: only ~110 GiB is
GPU-addressable (dedicated VRAM + GTT), so deploy-fit overstated usable
VRAM and could accept a model the iGPU cannot hold.

When the AMD iGPU's VRAM is unknown, query the ROCm-capable llama.cpp
engine's own `--list-devices` (preferring AIMA_ENGINE_DIR, else PATH) and
use its reported device memory (e.g. "ROCm0: ... (110456 MiB, ...)") as
GPU.VRAMMiB. Installed RAM, and thus the normalized "unified memory" the
UI shows, is unchanged; only the fit-relevant usable VRAM is corrected.

Tests (cim_test.go, build-tag-free): parseLlamaROCmVRAMMiB extracts the
device total and ignores non-device output.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
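As a rough illustration of the parsing step, here is a sketch of how a device total might be pulled out of `--list-devices` output. It is not the PR's parseLlamaROCmVRAMMiB: the function name `parseDeviceVRAMMiB` is hypothetical, and the sample output is invented to match only the shape quoted above ("ROCm0: ... (110456 MiB, ...)"). The real output format may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// rocmDeviceRe matches a device line of the assumed form
// "ROCm<N>: <anything> (<total> MiB, ...)" and captures the total.
var rocmDeviceRe = regexp.MustCompile(`^\s*ROCm\d+:.*\((\d+) MiB`)

// parseDeviceVRAMMiB returns the memory total of the first ROCm device line.
// Lines that are not device lines (banners, logs, other backends) are skipped.
func parseDeviceVRAMMiB(output string) (uint64, bool) {
	for _, line := range strings.Split(output, "\n") {
		m := rocmDeviceRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		n, err := strconv.ParseUint(m[1], 10, 64)
		if err != nil {
			continue
		}
		return n, true
	}
	return 0, false
}

func main() {
	// Invented sample shaped like the excerpt in the commit message.
	sample := "Available devices:\r\n  ROCm0: Example GPU (110456 MiB, 1024 MiB free)\r\n"
	if mib, ok := parseDeviceVRAMMiB(sample); ok {
		fmt.Printf("usable VRAM: %d MiB\n", mib)
	} else {
		fmt.Println("no ROCm device reported")
	}
}
```

Returning `ok == false` on unrecognized output lets the caller keep the existing unified-memory backfill instead of writing a zero into GPU.VRAMMiB.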
rjckkkkk added a commit that referenced this pull request on Jun 10, 2026
The dist exe filename now carries its version string so a new build does NOT
overwrite the previous one in place. Two builds coexist:
  aima-windows-amd64-v0.5-dev-amd-strix-halo-20260610.exe -> v0.5-dev-amd-strix-halo-20260610
      latest; adds the out-of-box AMD-HIP llama.cpp engine (#85, #86). serve.bat uses this.
  aima-windows-amd64-v0.5-dev-amd-strix-halo.exe -> v0.5-dev-amd-strix-halo
      restored 2026-06-09 build (#78-#83 only, no HIP auto-download). Kept for rollback.
Both from source commit fc3ef41; filename == the exe's own `aima version` string.
- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe + in source at
  catalog/engines/llamacpp-hip-windows.yaml, pins official b9330 win-hip-radeon-x64):
  a no-NVIDIA Strix Halo box auto-downloads the right ROCm/HIP llama.cpp instead of
  the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR
  (dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp
  of ANY version is launchable -- the partner's llama.cpp version is supported
  whether or not it matches the bundled b9330.
Verified the latest build on the 395 rig: `aima version` ->
v0.5-dev-amd-strix-halo-20260610 / fc3ef41; `aima engine info llamacpp-hip-windows`
-> b9330 asset; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.
README + serve.bat updated (build table; AIMA_ENGINE_DIR now optional).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
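The lookup order described for #86 (dist -> AIMA_ENGINE_DIR -> auto-download -> PATH) amounts to a first-match search over ordered sources. The sketch below shows that pattern only; it is not the runtime's actual resolver. The names `source`, `resolve`, `inDir`, and `onPath`, the `dist` directory path, and the `llama-server.exe` binary name are assumptions, and the auto-download step is a stub.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// source is one place an engine binary may come from (hypothetical type).
type source struct {
	Label string
	Find  func() (string, bool)
}

// resolve returns the first source, in order, that yields a binary path.
func resolve(sources []source) (label, path string, ok bool) {
	for _, s := range sources {
		if p, found := s.Find(); found {
			return s.Label, p, true
		}
	}
	return "", "", false
}

// inDir looks for binary as a regular file inside dir; an empty dir never matches.
func inDir(dir, binary string) func() (string, bool) {
	return func() (string, bool) {
		if dir == "" {
			return "", false
		}
		p := filepath.Join(dir, binary)
		if info, err := os.Stat(p); err == nil && !info.IsDir() {
			return p, true
		}
		return "", false
	}
}

// onPath searches the directories in the PATH environment variable.
func onPath(binary string) func() (string, bool) {
	return func() (string, bool) {
		p, err := exec.LookPath(binary)
		return p, err == nil
	}
}

func main() {
	const binary = "llama-server.exe" // assumed binary name
	sources := []source{
		{"dist", inDir("dist", binary)}, // assumed location of bundled engines
		{"AIMA_ENGINE_DIR", inDir(os.Getenv("AIMA_ENGINE_DIR"), binary)},
		{"auto-download", func() (string, bool) { return "", false }}, // stub
		{"PATH", onPath(binary)},
	}
	if label, path, ok := resolve(sources); ok {
		fmt.Printf("using %s (from %s)\n", path, label)
	} else {
		fmt.Println("no engine binary found")
	}
}
```

Because each source is just a function, an unset AIMA_ENGINE_DIR falls through to the next step, which is consistent with the note that the variable is now optional.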