Fix/reasoning close only by utkxrsh26 · Pull Request #2 · utkxrsh26/LocalAI

utkxrsh26 · 2026-06-04T20:18:08Z

Description

This PR fixes #

Notes for Reviewers

Signed commits

Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

codity-dev-utk · 2026-06-04T20:18:20Z

PR Summary

What Changed

Moved reasoning extraction from pkg/functions to a new pkg/reasoning package with cleaner API and options pattern.
Added support for "closing-only" tag detection to handle models like GLM-4 that output reasoning without opening tags.
Added ReasoningConfig field to model configuration for explicit control over extraction behavior.

Key Changes by Area

Reasoning Extraction: New package with Extract() function supporting multiple model formats (DeepSeek, GLM-4, Command R7B, Apertus, Seed, Magistral) and streaming content.

Configuration: Added ReasoningConfig struct to model configuration for per-model reasoning settings.

OpenAI Endpoint: Replaced functions.ExtractReasoning with reasoning.Extract and added auto-detection of forced-open thinking mode across streaming and non-streaming paths.

Files Changed

File	Changes Summary
core/config/model_config.go	Added `ReasoningConfig` field to model configuration
core/http/endpoints/openai/chat.go	Migrated to `reasoning.Extract`, added auto-detection for forced-open mode
pkg/functions/reasoning.go	Deleted (moved to new package)
pkg/functions/reasoning_test.go	Deleted (moved to new package)
pkg/reasoning/config.go	New file: configuration struct for reasoning extraction
pkg/reasoning/options.go	New file: options pattern including `WithThinkingForcedOpen()`
pkg/reasoning/reasoning.go	New file: main extraction logic with tag detection and forced-open handling
pkg/reasoning/reasoning_suite_test.go	New file: test suite setup
pkg/reasoning/reasoning_test.go	New file: comprehensive test coverage for all tag formats and edge cases

Review Focus Areas

Auto-detection logic in DetectThinkingForcedOpen for prompts ending with opening tags.
Handling of unclosed and closing-only tags in streaming responses.
Backward compatibility: verify no behavior changes for existing model configurations without ReasoningConfig.

Architecture

Design Decisions: The options pattern (WithThinkingForcedOpen) allows explicit control while auto-detection provides sensible defaults. Moving to a dedicated package separates concerns from the functions package.

Scalability & Extensibility: New model formats can be added by extending tag definitions in reasoning.go. Out of scope: this does not handle models with completely unstructured reasoning output.

Risks: Auto-detection relies on prompt inspection which may miss edge cases. This is acceptable as users can override via configuration. The forced-open mode treats all content as reasoning until a closing tag is found, which could misclassify actual response content if the closing tag is missing.

codity-dev-utk · 2026-06-04T20:24:00Z

+			if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
+				// Found a closing tag without a preceding opening tag
+				closingTagPos := endIdx + lastPos
+				if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {


The isClosingOnly detection logic is broken. When computing endIdx for the closing-tag-only path, the code uses strings.Index(remaining[lastPos:], tagPair.end): a relative offset: but then compares it directly to startIdx (also a relative offset from lastPos). However, when the condition fires (endIdx < startIdx), the code sets earliestStart = lastPos (treating the entire remaining slice as reasoning) and earliestEnd = closingTagPos + len(tagPair.end) where closingTagPos = endIdx + lastPos. This is correct for that branch. But the bug is in the guard: the condition earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) mixes earliestStart (repurposed to lastPos) with closingTagPos (absolute). Once isClosingOnly is set, earliestStart is set to lastPos, not to the position of any real tag. Subsequent iterations comparing a new closingTagPos to earliestStart (which equals the old lastPos) will produce incorrect results for content with multiple closing-only tags, potentially swallowing content or producing wrong reasoning boundaries.

Also reported at: pkg/reasoning/reasoning.go L163

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: pkg/reasoning/reasoning.go Lines: 163-163 Issue Type: functional-high Severity: high Issue Description: The `isClosingOnly` detection logic is broken. When computing `endIdx` for the closing-tag-only path, the code uses `strings.Index(remaining[lastPos:], tagPair.end)`: a relative offset: but then compares it directly to `startIdx` (also a relative offset from `lastPos`). However, when the condition fires (`endIdx < startIdx`), the code sets `earliestStart = lastPos` (treating the entire remaining slice as reasoning) and `earliestEnd = closingTagPos + len(tagPair.end)` where `closingTagPos = endIdx + lastPos`. This is correct for that branch. But the bug is in the guard: the condition `earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd)` mixes `earliestStart` (repurposed to `lastPos`) with `closingTagPos` (absolute). Once `isClosingOnly` is set, `earliestStart` is set to `lastPos`, not to the position of any real tag. Subsequent iterations comparing a new `closingTagPos` to `earliestStart` (which equals the old `lastPos`) will produce incorrect results for content with multiple closing-only tags, potentially swallowing content or producing wrong reasoning boundaries. _Also reported at: `pkg/reasoning/reasoning.go` L163_ Current Code: if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) { earliest Start = lastPos earliest End = closingTagPos + len(tagPair.end) isClosingOnly = true isUnclosed = false matchedTag = tagPair } --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:24:00Z

+		// Configure reasoning extraction options
+		// Auto-detect if prompt ends with thinking tag
+		// or use explicit config setting
+		thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)


The streaming path computes thinkingForcedOpen using the initial value of s (the first streaming token/chunk) via reasoning.DetectThinkingForcedOpen(s). At the point of evaluation, s is the first chunk emitted by the model: it is NOT the full prompt/predInput. DetectThinkingForcedOpen is designed to check whether the prompt ends with a thinking open tag (to detect models that start outputting reasoning content immediately). Passing the first streamed token instead of the full prompt/predInput defeats the purpose of this check: the first token is model output, not the prompt, so the suffix check will almost never match a thinking tag (and even if it does, it would be coincidental). For the non-streaming and tool-use paths, predInput and prompt are correctly passed. The streaming path should use predInput (or the equivalent prompt variable).

Suggested change

thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)

thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(predInput)

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: core/http/endpoints/openai/chat.go Lines: 50-50 Issue Type: functional-high Severity: high Issue Description: The streaming path computes `thinkingForcedOpen` using the initial value of `s` (the first streaming token/chunk) via `reasoning.DetectThinkingForcedOpen(s)`. At the point of evaluation, `s` is the first chunk emitted by the model: it is NOT the full prompt/predInput. `DetectThinkingForcedOpen` is designed to check whether the *prompt* ends with a thinking open tag (to detect models that start outputting reasoning content immediately). Passing the first streamed token instead of the full prompt/predInput defeats the purpose of this check: the first token is model output, not the prompt, so the suffix check will almost never match a thinking tag (and even if it does, it would be coincidental). For the non-streaming and tool-use paths, `predInput` and `prompt` are correctly passed. The streaming path should use `predInput` (or the equivalent prompt variable). Current Code: thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s) --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:24:00Z

+		// Configure reasoning extraction options
+		// Auto-detect if prompt ends with thinking tag
+		// or use explicit config setting
+		thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)


thinkingForcedOpen is computed once before ComputeChoices is called, using the value of s in the outer scope (which at that point is the last value assigned from a prior iteration or the zero value). However, predInput (the actual prompt sent to the model) is built later in the function. If predInput is constructed after this line, the detection will be checking the wrong or empty string. The non-streaming and tool-use paths correctly use predInput and prompt after construction. The streaming path should defer this check to inside the callback, or at minimum ensure predInput is available at the point of the check.

Suggested change

thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)

thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(predInput)

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: core/http/endpoints/openai/chat.go Lines: 50-50 Issue Type: robustness-medium Severity: medium Issue Description: `thinkingForcedOpen` is computed once before `ComputeChoices` is called, using the value of `s` in the outer scope (which at that point is the last value assigned from a prior iteration or the zero value). However, `predInput` (the actual prompt sent to the model) is built later in the function. If `predInput` is constructed after this line, the detection will be checking the wrong or empty string. The non-streaming and tool-use paths correctly use `predInput` and `prompt` after construction. The streaming path should defer this check to inside the callback, or at minimum ensure `predInput` is available at the point of the check. Current Code: thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s) --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:24:30Z

Security Scan Summary

Metric	Value
Vulnerabilities	Critical: 0
Overall Risk	Clean
Files Scanned	9

No critical security issues detected

Scan completed in 21.5s

Security scan powered by Codity.ai

codity-dev-utk · 2026-06-04T20:26:36Z

License Compliance Scan

Metric	Value
Packages Scanned	334
High Risk (Strong Copyleft)	0
Medium Risk (Weak Copyleft)	4
Low Risk (Permissive)	294
Unknown License	36

Weak copyleft licenses found - verify compatibility

Some packages have unknown licenses - manual review required

Medium Risk Licenses - 4 packages

MPL-2.0 (4 packages):

github.com/libp2p/go-yamux/v5 5.0.1
github.com/hashicorp/golang-lru/v2 2.0.7
github.com/hashicorp/golang-lru 1.0.2
github.com/shoenig/go-m1cpu 0.1.6

Unknown Licenses - 36 packages

github.com/docker/docker 28.5.2+incompatible
github.com/docker/docker-credential-helpers 0.9.3
github.com/docker/go-units 0.5.0
github.com/docker/go-connections 0.6.0
github.com/dsnet/compress 0.0.2-0.20210315054119-f66993602bf5
github.com/flynn/noise 1.1.0
github.com/go-audio/audio 1.0.0
github.com/go-audio/riff 1.0.0
github.com/go-logr/stdr 1.2.2
github.com/go-logr/logr 1.4.3
github.com/go-openapi/jsonpointer 0.21.0
github.com/go-openapi/jsonreference 0.21.0
github.com/go-openapi/swag 0.23.0
github.com/go-openapi/spec 0.21.0
github.com/gogo/protobuf 1.3.2
github.com/golang/groupcache 0.0.0-20210331224755-41bb18bfe9da
github.com/golang/snappy 0.0.4
github.com/google/btree 1.1.3
github.com/google/go-cmp 0.7.0
github.com/google/gopacket 1.1.19

...and 16 more

Powered by Codity.ai · Docs

utkxrsh26 · 2026-06-04T20:28:33Z

@codity review

codity-dev-utk · 2026-06-04T20:28:46Z

PR Summary

What Changed

Moved reasoning extraction from pkg/functions to a new dedicated pkg/reasoning package with enhanced functionality for models that output reasoning without explicit opening tags.
Added support for GLM-4 and similar models that use "closing-only" tags, where reasoning content appears before a closing delimiter without an opening tag.
Integrated auto-detection of forced-open thinking mode based on prompt suffix or explicit model configuration.

Key Changes by Area

Reasoning Extraction: New Extract() function with WithThinkingForcedOpen() option handles models that output reasoning without opening tags. Supports DeepSeek, Command R7B, Apertus, Seed, Magistral, and GLM-4 formats.

Configuration: Added ReasoningConfig field to model configuration in core/config/model_config.go:52.

API Integration: Updated OpenAI chat endpoint to use new reasoning.Extract() with auto-detection of forced-open mode.

Files Changed

File	Changes Summary
core/config/model_config.go	Added `ReasoningConfig` field to model configuration
core/http/endpoints/openai/chat.go	Updated to use new `reasoning.Extract()` with auto-detection logic
pkg/functions/reasoning.go	Removed (functionality moved to new package)
pkg/functions/reasoning_test.go	Removed (tests moved to new package)
pkg/reasoning/config.go	New file: configuration types for reasoning extraction
pkg/reasoning/options.go	New file: extraction options including `WithThinkingForcedOpen()`
pkg/reasoning/reasoning.go	New file: main extraction logic with support for closing-only tags
pkg/reasoning/reasoning_suite_test.go	New file: test suite setup
pkg/reasoning/reasoning_test.go	New file: comprehensive tests for all tag formats and edge cases

Review Focus Areas

DRY violation in core/http/endpoints/openai/chat.go:244: The detection pattern for thinkingForcedOpen / toolsThinkingForcedOpen / nonStreamThinkingForcedOpen is repeated identically in three code paths (streaming, tools, non-streaming). Consider extracting to a shared helper.
Correctness of closing-only tag detection in pkg/reasoning/reasoning.go:118-256 for GLM-4 style output where reasoning precedes the closing tag without an opening tag.
Edge case handling in extractFromTags for unclosed tags, nested tags, and streaming scenarios.

Architecture

Design Decisions: The forced-open mode detection is intentionally duplicated across three code paths in the chat endpoint to avoid premature abstraction. The tradeoff accepts maintenance burden for clarity in each path.

Risks: The repeated detection pattern in chat.go is an intentional risk. If more options are added to reasoning extraction, the three sites must be updated consistently or bugs will occur.

codity-dev-utk · 2026-06-04T20:30:25Z

+		// Configure reasoning extraction options
+		// Auto-detect if prompt ends with thinking tag
+		// or use explicit config setting
+		thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)


The streaming path computes thinkingForcedOpen once, before the streaming loop starts, using the initial value of s (the prompt). However, the outer variable s is reassigned inside each streaming callback invocation (func(s string, …)), which shadows the outer s. The real concern is that DetectThinkingForcedOpen(s) is called with whatever s happens to be at closure-capture time: which is the pre-streaming prompt string. This is actually intentional for detection, BUT the detection happens before ComputeChoices is called, so s here is still the prompt, which is correct. However, if s is the raw prompt and hasn't been processed by the template engine yet (i.e., template expansion happens inside ComputeChoices), the detection will be applied against the un-expanded prompt. In contrast, the tools path and non-streaming path use prompt and predInput respectively: variables that presumably hold the already-rendered prompt. This inconsistency means the auto-detection will produce different results in the streaming path vs. the other two paths for the same model/request.

Suggested change

thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)

thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(predInput)

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: core/http/endpoints/openai/chat.go Lines: 50-50 Issue Type: functional-high Severity: high Issue Description: The streaming path computes `thinkingForcedOpen` once, before the streaming loop starts, using the initial value of `s` (the prompt). However, the outer variable `s` is reassigned inside each streaming callback invocation (`func(s string, …)`), which shadows the outer `s`. The real concern is that `DetectThinkingForcedOpen(s)` is called with whatever `s` happens to be at closure-capture time: which is the pre-streaming prompt string. This is actually intentional for detection, BUT the detection happens before `ComputeChoices` is called, so `s` here is still the prompt, which is correct. However, if `s` is the *raw* prompt and hasn't been processed by the template engine yet (i.e., template expansion happens inside `ComputeChoices`), the detection will be applied against the un-expanded prompt. In contrast, the tools path and non-streaming path use `prompt` and `predInput` respectively: variables that presumably hold the already-rendered prompt. This inconsistency means the auto-detection will produce different results in the streaming path vs. the other two paths for the same model/request. Current Code: thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s) --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:25Z

+			endIdx := strings.Index(remaining[lastPos:], tagPair.end)
+
+			// Check for closing-only tag (closing tag appears before or without opening tag)
+			if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {


The closing-only tag comparison in extractFromTags is incorrect. When a closing-only match is found, earliestStart is set to lastPos and earliestEnd is set to closingTagPos + len(tagPair.end). Later, when choosing among candidates, the guard closingTagPos < earliestStart compares against lastPos (not the position of the previously found closing tag). Additionally isClosingOnly && closingTagPos < earliestEnd compares against earliestEnd which was set to closingTagPos + len(previousTag.end), not closingTagPos. This means when two different closing tags appear at different positions, the selection logic may incorrectly retain a later tag over an earlier one. For example if </think> appears at offset 10 and </thinking> appears at offset 5, the function may still select </think> depending on iteration order. The correct comparator should use the actual closing tag position rather than derived values.

Suggested fix

if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) { // Found a closing tag without a preceding opening tag closingTagPos := endIdx + lastPos if earliestStart == -1 || (isClosingOnly && closingTagPos < earliestEnd-len(matchedTag.end)) || (!isClosingOnly && (earliestStart == -1 || closingTagPos < earliestStart)) { earliestStart = lastPos earliestEnd = closingTagPos + len(tagPair.end) isClosingOnly = true isUnclosed = false matchedTag = tagPair } continue }

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: pkg/reasoning/reasoning.go Lines: 160-160 Issue Type: functional-high Severity: high Issue Description: The closing-only tag comparison in `extractFromTags` is incorrect. When a closing-only match is found, `earliestStart` is set to `lastPos` and `earliestEnd` is set to `closingTagPos + len(tagPair.end)`. Later, when choosing among candidates, the guard `closingTagPos < earliestStart` compares against `lastPos` (not the position of the previously found closing tag). Additionally `isClosingOnly && closingTagPos < earliestEnd` compares against `earliestEnd` which was set to `closingTagPos + len(previousTag.end)`, not `closingTagPos`. This means when two different closing tags appear at different positions, the selection logic may incorrectly retain a later tag over an earlier one. For example if `</think>` appears at offset 10 and `</thinking>` appears at offset 5, the function may still select `</think>` depending on iteration order. The correct comparator should use the actual closing tag position rather than derived values. Current Code: if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) { // Found a closing tag without a preceding opening tag closingTagPos := endIdx + lastPos if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) { earliestStart = lastPos earliestEnd = closingTagPos + len(tagPair.end) isClosingOnly = true isUnclosed = false matchedTag = tagPair } continue } --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:25Z

 			accumulatedContent += s
 			// Extract reasoning from accumulated content
-			currentReasoning, cleanedContent := functions.ExtractReasoning(accumulatedContent)
+			opts := []reasoning.Option{}


Inside the streaming token callback, opts := []reasoning.Option{} allocates a new slice on every invocation, and reasoning.Extract is called on the entire accumulated content (not just the new token). For a long response this results in O(n²) work for reasoning extraction. The opts slice construction is also slightly wasteful: it should be pre-built once outside the callback closure, similar to how the non-streaming path does it.

Suggested change

opts := []reasoning.Option{}

currentReasoning, cleanedContent := reasoning.Extract(accumulatedContent, extractOpts...)

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: core/http/endpoints/openai/chat.go Lines: 55-55 Issue Type: performance-medium Severity: medium Issue Description: Inside the streaming token callback, `opts := []reasoning.Option{}` allocates a new slice on every invocation, and `reasoning.Extract` is called on the **entire accumulated content** (not just the new token). For a long response this results in O(n²) work for reasoning extraction. The `opts` slice construction is also slightly wasteful: it should be pre-built once outside the callback closure, similar to how the non-streaming path does it. Current Code: opts := []reasoning.Option{} if thinkingForcedOpen { opts = append(opts, reasoning.WithThinkingForcedOpen()) } currentReasoning, cleanedContent := reasoning.Extract(accumulatedContent, opts...) --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:25Z

+			if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
+				// Found a closing tag without a preceding opening tag
+				closingTagPos := endIdx + lastPos
+				if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {


In the closing-only path inside extractFromTags, closingTagPos is computed as endIdx + lastPos, but endIdx is the result of strings.Index(remaining[lastPos:], tagPair.end): i.e., it is already relative to remaining[lastPos:]. Adding lastPos again is correct and gives the absolute position in remaining. However the check closingTagPos < earliestStart compares the absolute position of the new candidate's closing tag against earliestStart, which for a previous closing-only candidate was set to lastPos (NOT to the previous closing tag's position). So a closing tag at position 5 could be skipped in favour of one at position 50 simply because a prior iteration set earliestStart = lastPos = 0. The guard condition must compare the candidate closing tag's position against the previously-matched closing tag's position (i.e., earliestEnd - len(matchedTag.end)).

Also reported at: pkg/reasoning/reasoning.go L211

Suggested fix

prevClosingTagPos := earliestEnd - len(matchedTag.end) if earliestStart == -1 || (!isClosingOnly && closingTagPos < earliestStart) || (isClosingOnly && closingTagPos < prevClosingTagPos) {

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: pkg/reasoning/reasoning.go Lines: 163-163 Issue Type: functional-medium Severity: medium Issue Description: In the closing-only path inside `extractFromTags`, `closingTagPos` is computed as `endIdx + lastPos`, but `endIdx` is the result of `strings.Index(remaining[lastPos:], tagPair.end)`: i.e., it is already relative to `remaining[lastPos:]`. Adding `lastPos` again is correct and gives the absolute position in `remaining`. However the check `closingTagPos < earliestStart` compares the absolute position of the new candidate's closing tag against `earliestStart`, which for a previous closing-only candidate was set to `lastPos` (NOT to the previous closing tag's position). So a closing tag at position 5 could be skipped in favour of one at position 50 simply because a prior iteration set `earliestStart = lastPos = 0`. The guard condition must compare the candidate closing tag's position against the previously-matched closing tag's position (i.e., `earliestEnd - len(matchedTag.end)`). _Also reported at: `pkg/reasoning/reasoning.go` L211_ Current Code: if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) { --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:25Z

+	lastPos := 0
+
+	for {
+		// Find the earliest tag start


Inside the extractFromTags loop, tagPairs is a slice of anonymous structs allocated on every call. While Go will optimise this to a compile-time constant, the inner loop also calls strings.Index(remaining[lastPos:], tagPair.start) and strings.Index(remaining[lastPos:], tagPair.end) independently, creating two sub-string searches per tag pair per iteration. This is called on every streaming chunk (potentially thousands of times during a single response), and on the entire accumulated content each time. For large responses this is O(n²) in total work. The old implementation had the same issue, but the new multi-tag support doubles the constant factor. Consider operating only on the new suffix of accumulated content rather than re-scanning from the beginning.

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: pkg/reasoning/reasoning.go Lines: 145-145 Issue Type: performance-medium Severity: medium Issue Description: Inside the `extractFromTags` loop, `tagPairs` is a slice of anonymous structs allocated on every call. While Go will optimise this to a compile-time constant, the inner loop also calls `strings.Index(remaining[lastPos:], tagPair.start)` and `strings.Index(remaining[lastPos:], tagPair.end)` independently, creating two sub-string searches per tag pair per iteration. This is called on every streaming chunk (potentially thousands of times during a single response), and on the *entire accumulated content* each time. For large responses this is O(n²) in total work. The old implementation had the same issue, but the new multi-tag support doubles the constant factor. Consider operating only on the new suffix of accumulated content rather than re-scanning from the beginning. Current Code: for { // Find the earliest tag start earliestStart := -1 earliestEnd := -1 isUnclosed := false isClosingOnly := false --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:25Z

+		}
+	}
+
+	if earliestCloseIdx == -1 {


In extractForcedOpen, when no closing tag is found, the entire content is returned as reasoning with TrimSpace applied. This means that during streaming, each incremental call strips leading/trailing whitespace from partial reasoning. If the model is mid-sentence and the partial chunk ends with a space or newline, that trailing whitespace is permanently lost before the final response is assembled. This can mangle the reasoning content, especially for models that produce significant leading whitespace (e.g. <think> → forced open → content starts with a newline that gets trimmed on each streaming tick).

Suggested fix

if earliestCloseIdx == -1 { // No closing tag found - all content is reasoning (still streaming) // Do not trim here; trimming is applied only when the closing tag is found return content, "" }

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: pkg/reasoning/reasoning.go Lines: 91-91 Issue Type: robustness-medium Severity: medium Issue Description: In `extractForcedOpen`, when no closing tag is found, the entire content is returned as reasoning with `TrimSpace` applied. This means that during streaming, each incremental call strips leading/trailing whitespace from partial reasoning. If the model is mid-sentence and the partial chunk ends with a space or newline, that trailing whitespace is permanently lost before the final response is assembled. This can mangle the reasoning content, especially for models that produce significant leading whitespace (e.g. `<think> ` → forced open → content starts with a newline that gets trimmed on each streaming tick). Current Code: if earliestCloseIdx == -1 { // No closing tag found - all content is reasoning (still streaming) return strings.TrimSpace(content), "" } --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:25Z

+
+// Common thinking/reasoning opening tags used by various models.
+// These match the tags detected by llama.cpp in common/chat.cpp
+var thinkingOpenTags = []string{


The thinkingOpenTags list used by DetectThinkingForcedOpen and the closing tags list inside extractForcedOpen are not kept in sync with the tag pairs list inside extractFromTags. For example, [THINK] is in thinkingOpenTags but only [THINK] (without newline) is in tagPairs as an opening tag. Similarly, <think> and <thinking> are in thinkingOpenTags but the tag-pairs list uses <think> and <thinking> (without the newline). If a model's prompt ends with <think> (newline variant) the detection fires, but extractFromTags will never find a matching <think> opening tag and will fall back to the closing-only path. The three lists should be consolidated or cross-referenced to avoid drift.

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices. ### Context File: pkg/reasoning/reasoning.go Lines: 9-9 Issue Type: maintainability-medium Severity: medium Issue Description: The `thinkingOpenTags` list used by `DetectThinkingForcedOpen` and the closing tags list inside `extractForcedOpen` are not kept in sync with the tag pairs list inside `extractFromTags`. For example, `[THINK] ` is in `thinkingOpenTags` but only `[THINK]` (without newline) is in `tagPairs` as an opening tag. Similarly, `<think> ` and `<thinking> ` are in `thinkingOpenTags` but the tag-pairs list uses `<think>` and `<thinking>` (without the newline). If a model's prompt ends with `<think> ` (newline variant) the detection fires, but `extractFromTags` will never find a matching `<think> ` opening tag and will fall back to the closing-only path. The three lists should be consolidated or cross-referenced to avoid drift. Current Code: var thinkingOpenTags = []string{ // DeepSeek R1, V3.1, Nemotron V2, MiniMax M2, Hermes 2 Pro, Granite, Exaone MOE "<think>\n", "<think>", // Generic thinking tags "<thinking>\n", "<thinking>", // Apertus "<|inner_prefix|>", // Command R7B "<|START_THINKING|>", // Seed "<seed:think>", // Magistral (not in llama.cpp but common) "[THINK]\n", "[THINK]", } --- ### Instructions 1. Fix the issue described above 2. Maintain the exact indentation and code style from the original 3. Follow bash best practices and language-specific idioms 4. Ensure the fix addresses the root cause, not just the symptoms 5. Add brief inline comments explaining the fix if needed ### Constraints - Do not change functionality beyond fixing the identified issue - Preserve existing variable names and function signatures unless they are part of the problem - Ensure the fix is production-ready ---

codity-dev-utk · 2026-06-04T20:30:28Z

Nitpicks (Low Priority)

Found 1 low-priority suggestions for code improvement

Click to expand nitpicks

`core/http/endpoints/openai/chat.go` (line 244)

The opts slice and the thinkingForcedOpen / toolsThinkingForcedOpen / nonStreamThinkingForcedOpen detection pattern is repeated identically in three places (streaming path, tools path, non-streaming path). This violates DRY and means any future option added to reasoning extraction must be propagated to all three sites. Extracting a helper like buildReasoningOpts(cfg config.BackendConfig, prompt string) []reasoning.Option would centralise this logic.

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: core/http/endpoints/openai/chat.go
Lines: 244-244
Issue Type: maintainability-low
Severity: low

Issue Description:
The `opts` slice and the `thinkingForcedOpen` / `toolsThinkingForcedOpen` / `nonStreamThinkingForcedOpen` detection pattern is repeated identically in three places (streaming path, tools path, non-streaming path). This violates DRY and means any future option added to reasoning extraction must be propagated to all three sites. Extracting a helper like `buildReasoningOpts(cfg config.BackendConfig, prompt string) []reasoning.Option` would centralise this logic.

Current Code:
		toolsThinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(prompt)
		opts := []reasoning.Option{}
		if toolsThinkingForcedOpen {
			opts = append(opts, reasoning.WithThinkingForcedOpen())
		}
		extractedReasoning, cleanedResult := reasoning.Extract(result, opts...)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-dev-utk · 2026-06-04T20:31:33Z

Security Scan Summary

Metric	Value
Vulnerabilities	Critical: 0
Overall Risk	Clean
Files Scanned	9

No critical security issues detected

Scan completed in 14.4s

Security scan powered by Codity.ai

mudler added 3 commits January 20, 2026 11:40

fix(reasoning): handle only closing tags

187e474

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

chore: refactorings

a352125

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Additional thinking tags

61a6e95

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

codity-dev-utk Bot reviewed Jun 4, 2026

View reviewed changes

	thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen \|\| reasoning.DetectThinkingForcedOpen(s)
	thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen \|\| reasoning.DetectThinkingForcedOpen(predInput)

	opts := []reasoning.Option{}
	currentReasoning, cleanedContent := reasoning.Extract(accumulatedContent, extractOpts...)

Conversation

utkxrsh26 commented Jun 4, 2026

Uh oh!

codity-dev-utk Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Summary

What Changed

Key Changes by Area

Files Changed

Review Focus Areas

Architecture

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Security Scan Summary

Scan completed in 21.5s

Uh oh!

codity-dev-utk Bot commented Jun 4, 2026

License Compliance Scan

Uh oh!

utkxrsh26 commented Jun 4, 2026

Uh oh!

codity-dev-utk Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Summary

What Changed

Key Changes by Area

Files Changed

Review Focus Areas

Architecture

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codity-dev-utk Bot commented Jun 4, 2026

Nitpicks (Low Priority)

core/http/endpoints/openai/chat.go (line 244)

Uh oh!

codity-dev-utk Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Security Scan Summary

Scan completed in 14.4s

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codity-dev-utk Bot commented Jun 4, 2026 •

edited

Loading

codity-dev-utk Bot commented Jun 4, 2026 •

edited

Loading

codity-dev-utk Bot commented Jun 4, 2026 •

edited

Loading

`core/http/endpoints/openai/chat.go` (line 244)

codity-dev-utk Bot commented Jun 4, 2026 •

edited

Loading