chore: bump llama.cpp to b9724 #19

Open
github-actions[bot] wants to merge 1 commit into main from automation/bump-llama-cpp

github-actions bot commented Jun 18, 2026

llama.cpp update

Upstream changelog

Release notes for b9724

mtmd: several bug fixes (#24784)

  • mtmd: several bug fixes
  • fix build
  • fix gemma4ua
  • add sanity check in get_u32()
  • fix build (2)
  • area() avoid overflow

macOS/iOS:

Linux:

Android:

Windows:

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)

UI:

Commit range

Commits from b9699 to b9724 (first 80)
  • [SYCL] rename GGML_SYCL_SUPPORT_LEVEL_ZERO (#24719) (9724f66)
  • mtmd: refactor preprocessor, add mtmd_image_preproc_out (#24736) (24bba7b)
  • server: fix router args not being forwarded to child instances (#24760) (968c438)
  • server: (router) rework -hf preset repo (#24739) (552258c)
  • server : return HTTP 400 on invalid grammar (#24144) (#24154) (1078621)
  • ui: provide touch accessible model selection UI (#24604) (2083217)
  • server : add last-5-seconds generation speed display (#24291) (0802307)
  • server: add "schema" and validation (#24150) (e1efd09)
  • server: (router) fix stopping_thread potentially hang (#24728) (fe7c8b2)
  • docs: fix export-lora --lora-scaled syntax [no release] (#24703) (7b6c5a2)
  • hexagon: support for op-trace (fine-grain tracing of HVX/HMX/DMA events) (#24592) (d2c6795)
  • mtmd: refactor llava-uhd overview image handling (always use ov_img_first) (#24769) (060ce1b)
  • cmake : fix ui build with read-only source (#24752) (32eddaf)
  • mtmd: add batching for mtmd-cli, add video tests (#24778) (a6b3260)
  • server: add "X-Accel-Buffering": "no" header to streaming endpoints (#24774) (40f3aaf)
  • Ggml/cuda col2im 1d (#24417) (3a3edc9)
  • mtmd: add batching support for internvl (#24775) (db52540)
  • ggml-cpu: support K tails in power10 Q8/Q4 MMA matmul (#24753) (8141e73)
  • server : consolidate slot selection into get_available_slot (#24755) (80452d6)
  • pi : remove docs from system prompt (#24791) (5bd21b8)
  • ggml : bump version to 0.15.2 (ggml/1548) (1868af1)
  • sync : ggml (5fd2dc2)
  • server: fix non-bound n_discard value (ctx shifting) (#24786) (159d093)
  • spec: support eagle3 for qwen3.5 & 3.6 (#24593) (b14e3fb)
  • mtmd: several bug fixes (#24784) (e2e7a9b)

Web bridge review focus

Please pay extra attention to upstream changes touching:

  • WebGPU, WASM, Emscripten, pthreads, or memory64 build behavior
  • ggml backend APIs used by the bridge
  • model loading, tokenizer, chat template, context/state persistence, or cache semantics
  • CMake/build flags that can affect the generated JS/WASM artifacts
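
As a first-pass triage of the commit range against the focus areas above, one option is to flag commit subjects by keyword. The sketch below is illustrative only: `FOCUS_KEYWORDS` and `flag_commits` are hypothetical names, the keyword list is an assumption derived from the bullets, and matching on subjects is a coarse proxy that does not replace reading the actual diffs.

```python
# Hypothetical keyword list derived from the review-focus bullets (an assumption;
# adjust to whatever the bridge actually depends on).
FOCUS_KEYWORDS = (
    "webgpu", "wasm", "emscripten", "pthread", "memory64",
    "ggml", "cmake", "tokenizer", "chat template", "cache",
)


def flag_commits(subjects):
    """Return the commit subjects that mention any focus keyword (case-insensitive)."""
    flagged = []
    for subject in subjects:
        lowered = subject.lower()
        if any(keyword in lowered for keyword in FOCUS_KEYWORDS):
            flagged.append(subject)
    return flagged


if __name__ == "__main__":
    sample = [
        "cmake : fix ui build with read-only source (#24752)",
        "pi : remove docs from system prompt (#24791)",
        "sync : ggml",
    ]
    for subject in flag_commits(sample):
        print("review:", subject)
```

A commit that changes a relevant file without naming it in the subject will not be flagged, so an empty result is not evidence that the range is safe.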

Validation

  • Emscripten build passed
  • Browser WebGPU/state-persistence smoke passed
  • Generated bridge artifacts include wasm32 and memory64 outputs
  • No stale hard-coded llama.cpp tag remains in CI/publish defaults
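
The stale-tag check in the last item could be approximated with a small scan like the one below. This is a sketch under stated assumptions, not the workflow's actual check: `find_stale_tags` is a hypothetical helper, and the pattern assumes release tags look like `b` followed by at least four digits, as `b9724` does.

```python
import re

# Assumed tag shape: "b" followed by four or more digits, delimited by word boundaries.
TAG_RE = re.compile(r"\bb(\d{4,})\b")


def find_stale_tags(text, current_tag):
    """Return the sorted tag-like strings in `text` that differ from `current_tag`."""
    found = {"b" + digits for digits in TAG_RE.findall(text)}
    return sorted(found - {current_tag})


if __name__ == "__main__":
    # Hypothetical CI config fragment, for illustration only.
    config = "default_tag: b9701\npublish_tag: b9724\n"
    print(find_stale_tags(config, "b9724"))
```

Short commit hashes such as `fe7c8b2` are not matched, because the `b` there is not at a word boundary and is not followed by four digits.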

Automation behavior

This PR is managed from the stable branch automation/bump-llama-cpp. If another llama.cpp release appears before merge, the scheduled workflow updates this same PR instead of opening a duplicate. The workflow skips if a non-automation PR already changes llama_cpp.version.
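
The behavior described above amounts to a three-way decision. The following is a minimal sketch of that logic, assuming a simplified view of open pull requests; `OpenPR`, `touches_version`, and `decide` are hypothetical names for illustration and do not reflect the workflow's real implementation.

```python
from dataclasses import dataclass

AUTOMATION_BRANCH = "automation/bump-llama-cpp"


@dataclass
class OpenPR:
    head_branch: str
    touches_version: bool  # hypothetical flag: True if the PR changes llama_cpp.version


def decide(open_prs):
    """Return "skip", "update", or "open" for the scheduled bump workflow."""
    # A non-automation PR that already changes the version takes precedence.
    if any(pr.touches_version and pr.head_branch != AUTOMATION_BRANCH for pr in open_prs):
        return "skip"
    # Reuse the existing automation PR instead of opening a duplicate.
    if any(pr.head_branch == AUTOMATION_BRANCH for pr in open_prs):
        return "update"
    return "open"


if __name__ == "__main__":
    print(decide([OpenPR(AUTOMATION_BRANCH, True)]))
    print(decide([OpenPR("feature/manual-bump", True)]))
    print(decide([]))
```

Checking the skip condition first means a manual version bump wins even when the automation PR is already open.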

github-actions bot force-pushed the automation/bump-llama-cpp branch from fcc8744 to 37a6ad0 on June 19, 2026 13:49
github-actions bot changed the title from "chore: bump llama.cpp to b9701" to "chore: bump llama.cpp to b9724" on Jun 19, 2026