fix(memos-local-plugin): system message ordering + local endpoint apiKey relaxation #1921
Open
redashes1984 wants to merge 1 commit into
…hat messages

Fixes HTTP 400 errors with Qwen3.6/vLLM which enforces 'system message must be at the beginning' in its Jinja2 chat template. When skill crystallization or reward scoring constructs messages with the system role appearing after user/assistant turns, vLLM's apply_chat_template throws:

jinja2.exceptions.TemplateError: System message must be at the beginning.

normalizeMessages() now:
1. Extracts all system messages regardless of position
2. Merges them into a single leading system message
3. Appends all non-system messages in original order

Includes a fast path: if there's exactly one system already at position 0 with no later systems, returns the original array unchanged.
Changes

This PR fixes two issues in memOS-local-plugin core/llm providers that block self-hosted deployments using local inference servers (e.g., vLLM, llama.cpp) with Qwen3.6 and similar models.

Fix 1: System message ordering (HTTP 400 from vLLM)

File: apps/memos-local-plugin/core/llm/client.ts

Problem: When using Qwen3.6 via vLLM, skill.crystallize and reward.r_human LLM calls fail with:

jinja2.exceptions.TemplateError: System message must be at the beginning.

Root cause: normalizeMessages() passed the input array unchanged, so dynamically constructed messages could place role: "system" after user/assistant turns. Qwen3.6's Jinja2 chat template enforces system messages at the start.

Impact: ~2,300 HTTP 400 errors per day on a single deployment.

Fix: normalizeMessages() now:
1. Extracts all system messages regardless of position
2. Merges them into a single leading system message
3. Appends all non-system messages in original order

Includes a fast path: if there is exactly one system message already at position 0 with no later systems, the original array is returned unchanged (zero allocation overhead).
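For readers who want to see the shape of this change, the following is a minimal sketch of the reordering described for Fix 1, not the code from this PR. The ChatMessage type, the "\n\n" separator used when merging, and the early return for arrays without any system message are assumptions made for illustration.

```typescript
// Hypothetical message shape; the plugin's real type may carry more fields.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Sketch of the described behavior: one leading system message,
// followed by all non-system messages in their original order.
function normalizeMessages(messages: ChatMessage[]): ChatMessage[] {
  const systemMessages = messages.filter((m) => m.role === "system");

  // Assumption: with no system message there is nothing to reorder.
  if (systemMessages.length === 0) return messages;

  // Fast path: exactly one system message, already at position 0.
  // The original array is returned as-is, so nothing is allocated
  // beyond the filter above.
  if (systemMessages.length === 1 && messages[0]?.role === "system") {
    return messages;
  }

  // Merge every system message into a single one, keeping their
  // relative order. The blank-line separator is an assumption.
  const merged: ChatMessage = {
    role: "system",
    content: systemMessages.map((m) => m.content).join("\n\n"),
  };

  const rest = messages.filter((m) => m.role !== "system");
  return [merged, ...rest];
}
```

Merging rather than simply moving the system messages keeps the result at exactly one system turn, which is the strictest form a chat template is likely to accept.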
Fix 2: apiKey is required even for local endpoints

File: apps/memos-local-plugin/core/llm/providers/openai.ts

Problem: The openai_compatible provider throws LLM_UNAVAILABLE: openai_compatible provider requires config.llm.apiKey even when the endpoint is a local self-hosted inference server (e.g., http://10.10.4.8:8000/v1) that does not require authentication.

Impact: Users cannot use the openai_compatible provider with local vLLM/llama.cpp deployments without setting a dummy apiKey.

Fix: Added isLocalhostOrPrivateUrl() helper that detects:
- localhost, 127.0.0.1, ::1
- 10.x.x.x, 172.16-31.x.x, 192.168.x.x

When the endpoint is a private/local address, apiKey becomes optional. The Authorization header is only sent when apiKey is configured. Cloud endpoints (e.g., api.openai.com) still require apiKey as before.
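As a rough illustration of Fix 2, the check could be written along these lines. This is a sketch under stated assumptions, not the helper from this PR: only the name isLocalhostOrPrivateUrl() and the listed address ranges come from the description, while buildHeaders(), the Bearer scheme, and the choice to treat unparsable URLs as non-local are hypothetical.

```typescript
// Sketch of the private/local endpoint check; the PR's real helper may differ.
function isLocalhostOrPrivateUrl(endpoint: string): boolean {
  let host: string;
  try {
    host = new URL(endpoint).hostname;
  } catch {
    return false; // Assumption: unparsable endpoints count as non-local.
  }

  // URL.hostname keeps the brackets around IPv6 literals, e.g. "[::1]".
  host = host.replace(/^\[|\]$/g, "");

  if (host === "localhost" || host === "127.0.0.1" || host === "::1") {
    return true;
  }

  // Match dotted-quad IPv4 and test the private ranges named above.
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 10) return true; // 10.x.x.x
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16-31.x.x
  if (a === 192 && b === 168) return true; // 192.168.x.x
  return false;
}

// Hypothetical helper: requires apiKey only for non-local endpoints and
// adds the Authorization header only when a key is configured.
function buildHeaders(endpoint: string, apiKey?: string): Record<string, string> {
  if (!apiKey && !isLocalhostOrPrivateUrl(endpoint)) {
    throw new Error(
      "LLM_UNAVAILABLE: openai_compatible provider requires config.llm.apiKey",
    );
  }
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return headers;
}
```

With this sketch, buildHeaders("http://10.10.4.8:8000/v1") returns headers without Authorization, while the same call against api.openai.com without a key throws.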
Testing

Verified on a live deployment with Qwen3.6-27B-FP8 via vLLM: