Skip to content

Auto-wire llama.cpp --mmproj for VL models (zero-config vision)#90

Open
rjckkkkk wants to merge 1 commit into
developfrom
feat/auto-mmproj-vision
Open

Auto-wire llama.cpp --mmproj for VL models (zero-config vision)#90
rjckkkkk wants to merge 1 commit into
developfrom
feat/auto-mmproj-vision

Conversation

@rjckkkkk

Copy link
Copy Markdown
Collaborator

What

On deploy, when the engine is llamacpp and the model is gguf, auto-detect a co-located mmproj-*.gguf projector (preferring f16) next to the model and inject it as the mmproj config → llama-server gets --mmproj <path> → image input works with no manual config.

Why

GGUF VL models (e.g. Qwen2.5-VL) ship the LLM gguf + an mmproj projector; llama-server needs --mmproj to enable vision. The deploy command never passed it, so VL models served text only unless the user manually added --config mmproj=<path>. Skipped when mmproj is already set or no projector is present → plain LLMs unaffected.

Verified (AMD Strix Halo Win11, llama.cpp b9330)

aima deploy Qwen2.5-VL-3B-Instruct-Q4_K_M --engine llamacpp (no mmproj config) → log auto-wired multimodal projector, command gained --mmproj …mmproj-…-f16.gguf, deploy ready, and sending a solid-color image returned the correct color ("Green", "Blue").

🤖 Generated with Claude Code

GGUF vision models ship a co-located mmproj-*.gguf projector and llama-server
needs --mmproj to accept images, but the deploy never passed it, so VL models
served text only unless the user manually added `--config mmproj=<path>`.

On deploy, when the engine is llamacpp and the model format is gguf, look for a
co-located mmproj-*.gguf next to the model file (preferring an f16 projector) and
inject its path as the `mmproj` config (flows through configToFlags as --mmproj).
Skipped when the caller already set mmproj or no projector is present, so plain
LLMs are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rjckkkkk added a commit that referenced this pull request Jun 12, 2026
)

New version-stamped build aima-windows-amd64-v0.5-dev-amd-strix-halo-20260612.exe
(source commit fa35aa4) on top of the HIP-engine build. Adds, vs the 20260610 exe:

  #87 native deploy readiness uses the real runtime name (no false "not ready")
  #88 deploy launcher hidden (no cmd.exe console popup) via VBS launcher
  #89 Qwen2.5-VL-3B-Instruct catalog knowledge (vlm + aliases + verified perf)
  #90 zero-config vision: llama.cpp --mmproj auto-wired for VL gguf models
  #91 openclaw sync preflight-probes :6188 and warns loudly when unreachable

serve.bat now points at the 20260612 exe; older builds kept for rollback.
README build table + fixes list + OpenClaw data-plane guidance updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant