Skip to content

Fix/reasoning close only#2

Open
utkxrsh26 wants to merge 3 commits into
masterfrom
fix/reasoning-close-only
Open

Fix/reasoning close only#2
utkxrsh26 wants to merge 3 commits into
masterfrom
fix/reasoning-close-only

Conversation

@utkxrsh26
Copy link
Copy Markdown
Owner

Description

This PR fixes #

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@codity-dev-utk
Copy link
Copy Markdown

codity-dev-utk Bot commented Jun 4, 2026

PR Summary

What Changed

  • Moved reasoning extraction from pkg/functions to a new pkg/reasoning package with cleaner API and options pattern.
  • Added support for "closing-only" tag detection to handle models like GLM-4 that output reasoning without opening tags.
  • Added ReasoningConfig field to model configuration for explicit control over extraction behavior.

Key Changes by Area

Reasoning Extraction: New package with Extract() function supporting multiple model formats (DeepSeek, GLM-4, Command R7B, Apertus, Seed, Magistral) and streaming content.

Configuration: Added ReasoningConfig struct to model configuration for per-model reasoning settings.

OpenAI Endpoint: Replaced functions.ExtractReasoning with reasoning.Extract and added auto-detection of forced-open thinking mode across streaming and non-streaming paths.

Files Changed

File Changes Summary
core/config/model_config.go Added ReasoningConfig field to model configuration
core/http/endpoints/openai/chat.go Migrated to reasoning.Extract, added auto-detection for forced-open mode
pkg/functions/reasoning.go Deleted (moved to new package)
pkg/functions/reasoning_test.go Deleted (moved to new package)
pkg/reasoning/config.go New file: configuration struct for reasoning extraction
pkg/reasoning/options.go New file: options pattern including WithThinkingForcedOpen()
pkg/reasoning/reasoning.go New file: main extraction logic with tag detection and forced-open handling
pkg/reasoning/reasoning_suite_test.go New file: test suite setup
pkg/reasoning/reasoning_test.go New file: comprehensive test coverage for all tag formats and edge cases

Review Focus Areas

  • Auto-detection logic in DetectThinkingForcedOpen for prompts ending with opening tags.
  • Handling of unclosed and closing-only tags in streaming responses.
  • Backward compatibility: verify no behavior changes for existing model configurations without ReasoningConfig.

Architecture

Design Decisions: The options pattern (WithThinkingForcedOpen) allows explicit control while auto-detection provides sensible defaults. Moving to a dedicated package separates concerns from the functions package.

Scalability & Extensibility: New model formats can be added by extending tag definitions in reasoning.go. Out of scope: this does not handle models with completely unstructured reasoning output.

Risks: Auto-detection relies on prompt inspection which may miss edge cases. This is acceptable as users can override via configuration. The forced-open mode treats all content as reasoning until a closing tag is found, which could misclassify actual response content if the closing tag is missing.

if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
// Found a closing tag without a preceding opening tag
closingTagPos := endIdx + lastPos
if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional High

The isClosingOnly detection logic is broken. When computing endIdx for the closing-tag-only path, the code uses strings.Index(remaining[lastPos:], tagPair.end): a relative offset: but then compares it directly to startIdx (also a relative offset from lastPos). However, when the condition fires (endIdx < startIdx), the code sets earliestStart = lastPos (treating the entire remaining slice as reasoning) and earliestEnd = closingTagPos + len(tagPair.end) where closingTagPos = endIdx + lastPos. This is correct for that branch. But the bug is in the guard: the condition earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) mixes earliestStart (repurposed to lastPos) with closingTagPos (absolute). Once isClosingOnly is set, earliestStart is set to lastPos, not to the position of any real tag. Subsequent iterations comparing a new closingTagPos to earliestStart (which equals the old lastPos) will produce incorrect results for content with multiple closing-only tags, potentially swallowing content or producing wrong reasoning boundaries.

Also reported at: pkg/reasoning/reasoning.go L163

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: pkg/reasoning/reasoning.go
Lines: 163-163
Issue Type: functional-high
Severity: high

Issue Description:
The `isClosingOnly` detection logic is broken. When computing `endIdx` for the closing-tag-only path, the code uses `strings.Index(remaining[lastPos:], tagPair.end)`: a relative offset: but then compares it directly to `startIdx` (also a relative offset from `lastPos`). However, when the condition fires (`endIdx < startIdx`), the code sets `earliestStart = lastPos` (treating the entire remaining slice as reasoning) and `earliestEnd = closingTagPos + len(tagPair.end)` where `closingTagPos = endIdx + lastPos`. This is correct for that branch. But the bug is in the guard: the condition `earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd)` mixes `earliestStart` (repurposed to `lastPos`) with `closingTagPos` (absolute). Once `isClosingOnly` is set, `earliestStart` is set to `lastPos`, not to the position of any real tag. Subsequent iterations comparing a new `closingTagPos` to `earliestStart` (which equals the old `lastPos`) will produce incorrect results for content with multiple closing-only tags, potentially swallowing content or producing wrong reasoning boundaries.

_Also reported at: `pkg/reasoning/reasoning.go` L163_

Current Code:
			if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {
					earliest Start = lastPos
					earliest End = closingTagPos + len(tagPair.end)
					isClosingOnly = true
					isUnclosed = false
					matchedTag = tagPair
				}

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

// Configure reasoning extraction options
// Auto-detect if prompt ends with thinking tag
// or use explicit config setting
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional High

The streaming path computes thinkingForcedOpen using the initial value of s (the first streaming token/chunk) via reasoning.DetectThinkingForcedOpen(s). At the point of evaluation, s is the first chunk emitted by the model: it is NOT the full prompt/predInput. DetectThinkingForcedOpen is designed to check whether the prompt ends with a thinking open tag (to detect models that start outputting reasoning content immediately). Passing the first streamed token instead of the full prompt/predInput defeats the purpose of this check: the first token is model output, not the prompt, so the suffix check will almost never match a thinking tag (and even if it does, it would be coincidental). For the non-streaming and tool-use paths, predInput and prompt are correctly passed. The streaming path should use predInput (or the equivalent prompt variable).

Suggested change
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(predInput)
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: core/http/endpoints/openai/chat.go
Lines: 50-50
Issue Type: functional-high
Severity: high

Issue Description:
The streaming path computes `thinkingForcedOpen` using the initial value of `s` (the first streaming token/chunk) via `reasoning.DetectThinkingForcedOpen(s)`. At the point of evaluation, `s` is the first chunk emitted by the model: it is NOT the full prompt/predInput. `DetectThinkingForcedOpen` is designed to check whether the *prompt* ends with a thinking open tag (to detect models that start outputting reasoning content immediately). Passing the first streamed token instead of the full prompt/predInput defeats the purpose of this check: the first token is model output, not the prompt, so the suffix check will almost never match a thinking tag (and even if it does, it would be coincidental). For the non-streaming and tool-use paths, `predInput` and `prompt` are correctly passed. The streaming path should use `predInput` (or the equivalent prompt variable).

Current Code:
		thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

// Configure reasoning extraction options
// Auto-detect if prompt ends with thinking tag
// or use explicit config setting
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Robustness Medium

thinkingForcedOpen is computed once before ComputeChoices is called, using the value of s in the outer scope (which at that point is the last value assigned from a prior iteration or the zero value). However, predInput (the actual prompt sent to the model) is built later in the function. If predInput is constructed after this line, the detection will be checking the wrong or empty string. The non-streaming and tool-use paths correctly use predInput and prompt after construction. The streaming path should defer this check to inside the callback, or at minimum ensure predInput is available at the point of the check.

Suggested change
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(predInput)
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: core/http/endpoints/openai/chat.go
Lines: 50-50
Issue Type: robustness-medium
Severity: medium

Issue Description:
`thinkingForcedOpen` is computed once before `ComputeChoices` is called, using the value of `s` in the outer scope (which at that point is the last value assigned from a prior iteration or the zero value). However, `predInput` (the actual prompt sent to the model) is built later in the function. If `predInput` is constructed after this line, the detection will be checking the wrong or empty string. The non-streaming and tool-use paths correctly use `predInput` and `prompt` after construction. The streaming path should defer this check to inside the callback, or at minimum ensure `predInput` is available at the point of the check.

Current Code:
		thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

@codity-dev-utk
Copy link
Copy Markdown

codity-dev-utk Bot commented Jun 4, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 21.5s

Security scan powered by Codity.ai

@codity-dev-utk
Copy link
Copy Markdown

codity-dev-utk Bot commented Jun 4, 2026

License Compliance Scan

Metric Value
Packages Scanned 334
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 4
Low Risk (Permissive) 294
Unknown License 36

Weak copyleft licenses found - verify compatibility

Some packages have unknown licenses - manual review required

Medium Risk Licenses - 4 packages

MPL-2.0 (4 packages):

  • github.com/libp2p/go-yamux/v5 5.0.1
  • github.com/hashicorp/golang-lru/v2 2.0.7
  • github.com/hashicorp/golang-lru 1.0.2
  • github.com/shoenig/go-m1cpu 0.1.6
Unknown Licenses - 36 packages
  • github.com/docker/docker 28.5.2+incompatible
  • github.com/docker/docker-credential-helpers 0.9.3
  • github.com/docker/go-units 0.5.0
  • github.com/docker/go-connections 0.6.0
  • github.com/dsnet/compress 0.0.2-0.20210315054119-f66993602bf5
  • github.com/flynn/noise 1.1.0
  • github.com/go-audio/audio 1.0.0
  • github.com/go-audio/riff 1.0.0
  • github.com/go-logr/stdr 1.2.2
  • github.com/go-logr/logr 1.4.3
  • github.com/go-openapi/jsonpointer 0.21.0
  • github.com/go-openapi/jsonreference 0.21.0
  • github.com/go-openapi/swag 0.23.0
  • github.com/go-openapi/spec 0.21.0
  • github.com/gogo/protobuf 1.3.2
  • github.com/golang/groupcache 0.0.0-20210331224755-41bb18bfe9da
  • github.com/golang/snappy 0.0.4
  • github.com/google/btree 1.1.3
  • github.com/google/go-cmp 0.7.0
  • github.com/google/gopacket 1.1.19

...and 16 more

Powered by Codity.ai · Docs

@utkxrsh26
Copy link
Copy Markdown
Owner Author

@codity review

@codity-dev-utk
Copy link
Copy Markdown

codity-dev-utk Bot commented Jun 4, 2026

PR Summary

What Changed

  • Moved reasoning extraction from pkg/functions to a new dedicated pkg/reasoning package with enhanced functionality for models that output reasoning without explicit opening tags.
  • Added support for GLM-4 and similar models that use "closing-only" tags, where reasoning content appears before a closing delimiter without an opening tag.
  • Integrated auto-detection of forced-open thinking mode based on prompt suffix or explicit model configuration.

Key Changes by Area

Reasoning Extraction: New Extract() function with WithThinkingForcedOpen() option handles models that output reasoning without opening tags. Supports DeepSeek, Command R7B, Apertus, Seed, Magistral, and GLM-4 formats.

Configuration: Added ReasoningConfig field to model configuration in core/config/model_config.go:52.

API Integration: Updated OpenAI chat endpoint to use new reasoning.Extract() with auto-detection of forced-open mode.

Files Changed

File Changes Summary
core/config/model_config.go Added ReasoningConfig field to model configuration
core/http/endpoints/openai/chat.go Updated to use new reasoning.Extract() with auto-detection logic
pkg/functions/reasoning.go Removed (functionality moved to new package)
pkg/functions/reasoning_test.go Removed (tests moved to new package)
pkg/reasoning/config.go New file: configuration types for reasoning extraction
pkg/reasoning/options.go New file: extraction options including WithThinkingForcedOpen()
pkg/reasoning/reasoning.go New file: main extraction logic with support for closing-only tags
pkg/reasoning/reasoning_suite_test.go New file: test suite setup
pkg/reasoning/reasoning_test.go New file: comprehensive tests for all tag formats and edge cases

Review Focus Areas

  • DRY violation in core/http/endpoints/openai/chat.go:244: The detection pattern for thinkingForcedOpen / toolsThinkingForcedOpen / nonStreamThinkingForcedOpen is repeated identically in three code paths (streaming, tools, non-streaming). Consider extracting to a shared helper.
  • Correctness of closing-only tag detection in pkg/reasoning/reasoning.go:118-256 for GLM-4 style output where reasoning precedes the closing tag without an opening tag.
  • Edge case handling in extractFromTags for unclosed tags, nested tags, and streaming scenarios.

Architecture

Design Decisions: The forced-open mode detection is intentionally duplicated across three code paths in the chat endpoint to avoid premature abstraction. The tradeoff accepts maintenance burden for clarity in each path.

Risks: The repeated detection pattern in chat.go is an intentional risk. If more options are added to reasoning extraction, the three sites must be updated consistently or bugs will occur.

// Configure reasoning extraction options
// Auto-detect if prompt ends with thinking tag
// or use explicit config setting
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional High

The streaming path computes thinkingForcedOpen once, before the streaming loop starts, using the initial value of s (the prompt). However, the outer variable s is reassigned inside each streaming callback invocation (func(s string, …)), which shadows the outer s. The real concern is that DetectThinkingForcedOpen(s) is called with whatever s happens to be at closure-capture time: which is the pre-streaming prompt string. This is actually intentional for detection, BUT the detection happens before ComputeChoices is called, so s here is still the prompt, which is correct. However, if s is the raw prompt and hasn't been processed by the template engine yet (i.e., template expansion happens inside ComputeChoices), the detection will be applied against the un-expanded prompt. In contrast, the tools path and non-streaming path use prompt and predInput respectively: variables that presumably hold the already-rendered prompt. This inconsistency means the auto-detection will produce different results in the streaming path vs. the other two paths for the same model/request.

Suggested change
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)
thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(predInput)
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: core/http/endpoints/openai/chat.go
Lines: 50-50
Issue Type: functional-high
Severity: high

Issue Description:
The streaming path computes `thinkingForcedOpen` once, before the streaming loop starts, using the initial value of `s` (the prompt). However, the outer variable `s` is reassigned inside each streaming callback invocation (`func(s string, …)`), which shadows the outer `s`. The real concern is that `DetectThinkingForcedOpen(s)` is called with whatever `s` happens to be at closure-capture time: which is the pre-streaming prompt string. This is actually intentional for detection, BUT the detection happens before `ComputeChoices` is called, so `s` here is still the prompt, which is correct. However, if `s` is the *raw* prompt and hasn't been processed by the template engine yet (i.e., template expansion happens inside `ComputeChoices`), the detection will be applied against the un-expanded prompt. In contrast, the tools path and non-streaming path use `prompt` and `predInput` respectively: variables that presumably hold the already-rendered prompt. This inconsistency means the auto-detection will produce different results in the streaming path vs. the other two paths for the same model/request.

Current Code:
		thinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(s)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

endIdx := strings.Index(remaining[lastPos:], tagPair.end)

// Check for closing-only tag (closing tag appears before or without opening tag)
if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional High

The closing-only tag comparison in extractFromTags is incorrect. When a closing-only match is found, earliestStart is set to lastPos and earliestEnd is set to closingTagPos + len(tagPair.end). Later, when choosing among candidates, the guard closingTagPos < earliestStart compares against lastPos (not the position of the previously found closing tag). Additionally isClosingOnly && closingTagPos < earliestEnd compares against earliestEnd which was set to closingTagPos + len(previousTag.end), not closingTagPos. This means when two different closing tags appear at different positions, the selection logic may incorrectly retain a later tag over an earlier one. For example if </think> appears at offset 10 and </thinking> appears at offset 5, the function may still select </think> depending on iteration order. The correct comparator should use the actual closing tag position rather than derived values.

Suggested fix
			if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
				// Found a closing tag without a preceding opening tag
				closingTagPos := endIdx + lastPos
				if earliestStart == -1 || (isClosingOnly && closingTagPos < earliestEnd-len(matchedTag.end)) || (!isClosingOnly && (earliestStart == -1 || closingTagPos < earliestStart)) {
					earliestStart = lastPos
					earliestEnd = closingTagPos + len(tagPair.end)
					isClosingOnly = true
					isUnclosed = false
					matchedTag = tagPair
				}
				continue
			}
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: pkg/reasoning/reasoning.go
Lines: 160-160
Issue Type: functional-high
Severity: high

Issue Description:
The closing-only tag comparison in `extractFromTags` is incorrect. When a closing-only match is found, `earliestStart` is set to `lastPos` and `earliestEnd` is set to `closingTagPos + len(tagPair.end)`. Later, when choosing among candidates, the guard `closingTagPos < earliestStart` compares against `lastPos` (not the position of the previously found closing tag). Additionally `isClosingOnly && closingTagPos < earliestEnd` compares against `earliestEnd` which was set to `closingTagPos + len(previousTag.end)`, not `closingTagPos`. This means when two different closing tags appear at different positions, the selection logic may incorrectly retain a later tag over an earlier one. For example if `</think>` appears at offset 10 and `</thinking>` appears at offset 5, the function may still select `</think>` depending on iteration order. The correct comparator should use the actual closing tag position rather than derived values.

Current Code:
			if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
				// Found a closing tag without a preceding opening tag
				closingTagPos := endIdx + lastPos
				if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {
					earliestStart = lastPos
					earliestEnd = closingTagPos + len(tagPair.end)
					isClosingOnly = true
					isUnclosed = false
					matchedTag = tagPair
				}
				continue
			}

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

accumulatedContent += s
// Extract reasoning from accumulated content
currentReasoning, cleanedContent := functions.ExtractReasoning(accumulatedContent)
opts := []reasoning.Option{}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance Medium

Inside the streaming token callback, opts := []reasoning.Option{} allocates a new slice on every invocation, and reasoning.Extract is called on the entire accumulated content (not just the new token). For a long response this results in O(n²) work for reasoning extraction. The opts slice construction is also slightly wasteful: it should be pre-built once outside the callback closure, similar to how the non-streaming path does it.

Suggested change
opts := []reasoning.Option{}
currentReasoning, cleanedContent := reasoning.Extract(accumulatedContent, extractOpts...)
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: core/http/endpoints/openai/chat.go
Lines: 55-55
Issue Type: performance-medium
Severity: medium

Issue Description:
Inside the streaming token callback, `opts := []reasoning.Option{}` allocates a new slice on every invocation, and `reasoning.Extract` is called on the **entire accumulated content** (not just the new token). For a long response this results in O(n²) work for reasoning extraction. The `opts` slice construction is also slightly wasteful: it should be pre-built once outside the callback closure, similar to how the non-streaming path does it.

Current Code:
			opts := []reasoning.Option{}
			if thinkingForcedOpen {
				opts = append(opts, reasoning.WithThinkingForcedOpen())
			}
			currentReasoning, cleanedContent := reasoning.Extract(accumulatedContent, opts...)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

if endIdx != -1 && (startIdx == -1 || endIdx < startIdx) {
// Found a closing tag without a preceding opening tag
closingTagPos := endIdx + lastPos
if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional Medium

In the closing-only path inside extractFromTags, closingTagPos is computed as endIdx + lastPos, but endIdx is the result of strings.Index(remaining[lastPos:], tagPair.end): i.e., it is already relative to remaining[lastPos:]. Adding lastPos again is correct and gives the absolute position in remaining. However the check closingTagPos < earliestStart compares the absolute position of the new candidate's closing tag against earliestStart, which for a previous closing-only candidate was set to lastPos (NOT to the previous closing tag's position). So a closing tag at position 5 could be skipped in favour of one at position 50 simply because a prior iteration set earliestStart = lastPos = 0. The guard condition must compare the candidate closing tag's position against the previously-matched closing tag's position (i.e., earliestEnd - len(matchedTag.end)).

Also reported at: pkg/reasoning/reasoning.go L211

Suggested fix
				prevClosingTagPos := earliestEnd - len(matchedTag.end)
				if earliestStart == -1 || (!isClosingOnly && closingTagPos < earliestStart) || (isClosingOnly && closingTagPos < prevClosingTagPos) {
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: pkg/reasoning/reasoning.go
Lines: 163-163
Issue Type: functional-medium
Severity: medium

Issue Description:
In the closing-only path inside `extractFromTags`, `closingTagPos` is computed as `endIdx + lastPos`, but `endIdx` is the result of `strings.Index(remaining[lastPos:], tagPair.end)`: i.e., it is already relative to `remaining[lastPos:]`. Adding `lastPos` again is correct and gives the absolute position in `remaining`. However the check `closingTagPos < earliestStart` compares the absolute position of the new candidate's closing tag against `earliestStart`, which for a previous closing-only candidate was set to `lastPos` (NOT to the previous closing tag's position). So a closing tag at position 5 could be skipped in favour of one at position 50 simply because a prior iteration set `earliestStart = lastPos = 0`. The guard condition must compare the candidate closing tag's position against the previously-matched closing tag's position (i.e., `earliestEnd - len(matchedTag.end)`).

_Also reported at: `pkg/reasoning/reasoning.go` L211_

Current Code:
				if earliestStart == -1 || closingTagPos < earliestStart || (isClosingOnly && closingTagPos < earliestEnd) {

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

lastPos := 0

for {
// Find the earliest tag start
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance Medium

Inside the extractFromTags loop, tagPairs is a slice of anonymous structs allocated on every call. While Go will optimise this to a compile-time constant, the inner loop also calls strings.Index(remaining[lastPos:], tagPair.start) and strings.Index(remaining[lastPos:], tagPair.end) independently, creating two sub-string searches per tag pair per iteration. This is called on every streaming chunk (potentially thousands of times during a single response), and on the entire accumulated content each time. For large responses this is O(n²) in total work. The old implementation had the same issue, but the new multi-tag support doubles the constant factor. Consider operating only on the new suffix of accumulated content rather than re-scanning from the beginning.

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: pkg/reasoning/reasoning.go
Lines: 145-145
Issue Type: performance-medium
Severity: medium

Issue Description:
Inside the `extractFromTags` loop, `tagPairs` is a slice of anonymous structs allocated on every call. While Go will optimise this to a compile-time constant, the inner loop also calls `strings.Index(remaining[lastPos:], tagPair.start)` and `strings.Index(remaining[lastPos:], tagPair.end)` independently, creating two sub-string searches per tag pair per iteration. This is called on every streaming chunk (potentially thousands of times during a single response), and on the *entire accumulated content* each time. For large responses this is O(n²) in total work. The old implementation had the same issue, but the new multi-tag support doubles the constant factor. Consider operating only on the new suffix of accumulated content rather than re-scanning from the beginning.

Current Code:
	for {
		// Find the earliest tag start
		earliestStart := -1
		earliestEnd := -1
		isUnclosed := false
		isClosingOnly := false

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

}
}

if earliestCloseIdx == -1 {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Robustness Medium

In extractForcedOpen, when no closing tag is found, the entire content is returned as reasoning with TrimSpace applied. This means that during streaming, each incremental call strips leading/trailing whitespace from partial reasoning. If the model is mid-sentence and the partial chunk ends with a space or newline, that trailing whitespace is permanently lost before the final response is assembled. This can mangle the reasoning content, especially for models that produce significant leading whitespace (e.g. <think> → forced open → content starts with a newline that gets trimmed on each streaming tick).

Suggested fix
	if earliestCloseIdx == -1 {
		// No closing tag found - all content is reasoning (still streaming)
		// Do not trim here; trimming is applied only when the closing tag is found
		return content, ""
	}
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: pkg/reasoning/reasoning.go
Lines: 91-91
Issue Type: robustness-medium
Severity: medium

Issue Description:
In `extractForcedOpen`, when no closing tag is found, the entire content is returned as reasoning with `TrimSpace` applied. This means that during streaming, each incremental call strips leading/trailing whitespace from partial reasoning. If the model is mid-sentence and the partial chunk ends with a space or newline, that trailing whitespace is permanently lost before the final response is assembled. This can mangle the reasoning content, especially for models that produce significant leading whitespace (e.g. `<think>
` → forced open → content starts with a newline that gets trimmed on each streaming tick).

Current Code:
	if earliestCloseIdx == -1 {
		// No closing tag found - all content is reasoning (still streaming)
		return strings.TrimSpace(content), ""
	}

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira


// Common thinking/reasoning opening tags used by various models.
// These match the tags detected by llama.cpp in common/chat.cpp
var thinkingOpenTags = []string{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintainability Medium

The thinkingOpenTags list used by DetectThinkingForcedOpen and the closing tags list inside extractForcedOpen are not kept in sync with the tag pairs list inside extractFromTags. For example, [THINK] is in thinkingOpenTags but only [THINK] (without newline) is in tagPairs as an opening tag. Similarly, <think> and <thinking> are in thinkingOpenTags but the tag-pairs list uses <think> and <thinking> (without the newline). If a model's prompt ends with <think> (newline variant) the detection fires, but extractFromTags will never find a matching <think> opening tag and will fall back to the closing-only path. The three lists should be consolidated or cross-referenced to avoid drift.

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: pkg/reasoning/reasoning.go
Lines: 9-9
Issue Type: maintainability-medium
Severity: medium

Issue Description:
The `thinkingOpenTags` list used by `DetectThinkingForcedOpen` and the closing tags list inside `extractForcedOpen` are not kept in sync with the tag pairs list inside `extractFromTags`. For example, `[THINK]
` is in `thinkingOpenTags` but only `[THINK]` (without newline) is in `tagPairs` as an opening tag. Similarly, `<think>
` and `<thinking>
` are in `thinkingOpenTags` but the tag-pairs list uses `<think>` and `<thinking>` (without the newline). If a model's prompt ends with `<think>
` (newline variant) the detection fires, but `extractFromTags` will never find a matching `<think>
` opening tag and will fall back to the closing-only path. The three lists should be consolidated or cross-referenced to avoid drift.

Current Code:
var thinkingOpenTags = []string{
	// DeepSeek R1, V3.1, Nemotron V2, MiniMax M2, Hermes 2 Pro, Granite, Exaone MOE
	"<think>\n",
	"<think>",
	// Generic thinking tags
	"<thinking>\n",
	"<thinking>",
	// Apertus
	"<|inner_prefix|>",
	// Command R7B
	"<|START_THINKING|>",
	// Seed
	"<seed:think>",
	// Magistral (not in llama.cpp but common)
	"[THINK]\n",
	"[THINK]",
}

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


Like Dislike Create Issue Jira

@codity-dev-utk
Copy link
Copy Markdown

codity-dev-utk Bot commented Jun 4, 2026

Nitpicks (Low Priority)

Found 1 low-priority suggestions for code improvement

Click to expand nitpicks

core/http/endpoints/openai/chat.go (line 244)

Maintainability Low

The opts slice and the thinkingForcedOpen / toolsThinkingForcedOpen / nonStreamThinkingForcedOpen detection pattern is repeated identically in three places (streaming path, tools path, non-streaming path). This violates DRY and means any future option added to reasoning extraction must be propagated to all three sites. Extracting a helper like buildReasoningOpts(cfg config.BackendConfig, prompt string) []reasoning.Option would centralise this logic.

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: core/http/endpoints/openai/chat.go
Lines: 244-244
Issue Type: maintainability-low
Severity: low

Issue Description:
The `opts` slice and the `thinkingForcedOpen` / `toolsThinkingForcedOpen` / `nonStreamThinkingForcedOpen` detection pattern is repeated identically in three places (streaming path, tools path, non-streaming path). This violates DRY and means any future option added to reasoning extraction must be propagated to all three sites. Extracting a helper like `buildReasoningOpts(cfg config.BackendConfig, prompt string) []reasoning.Option` would centralise this logic.

Current Code:
		toolsThinkingForcedOpen := config.ReasoningConfig.ThinkingForcedOpen || reasoning.DetectThinkingForcedOpen(prompt)
		opts := []reasoning.Option{}
		if toolsThinkingForcedOpen {
			opts = append(opts, reasoning.WithThinkingForcedOpen())
		}
		extractedReasoning, cleanedResult := reasoning.Extract(result, opts...)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Like Dislike

@codity-dev-utk
Copy link
Copy Markdown

codity-dev-utk Bot commented Jun 4, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 14.4s

Security scan powered by Codity.ai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants