
fix: use max_tokens for openai provider with custom base URL (Mistral compatibility)#858

Merged
nicoloboschi merged 1 commit into vectorize-io:main from octo-patch:fix/issue-852-use-max-tokens-for-custom-base-url
Apr 7, 2026

Conversation

@octo-patch
Contributor

Fixes #852

Problem

When using HINDSIGHT_API_LLM_PROVIDER=openai with HINDSIGHT_API_LLM_BASE_URL pointing to Mistral's API (https://api.mistral.ai/v1), Hindsight crashes on startup during verify_connection because max_completion_tokens is sent but Mistral only accepts max_tokens, returning a 422:

openai.UnprocessableEntityError: Error code: 422 - {'message': {'detail': [{'type': 'extra_forbidden', 'loc': ['body', 'max_completion_tokens'], ...}]}}
RuntimeError: Connection verification failed for openai/mistral-small-latest

Solution

Add _max_tokens_param_name() method that selects the correct parameter name based on provider and base URL:

  • Native OpenAI (no custom base_url) → max_completion_tokens
  • Groq → max_completion_tokens (Groq supports the newer name)
  • openai with custom base_url, ollama, lmstudio, minimax, volcano → max_tokens

A custom base_url on the openai provider signals a third-party compatible API (Mistral, Together AI, etc.) that may not have adopted max_completion_tokens.

Testing

  • Existing tests pass unchanged (they test the call() argument name, not the API key)
  • The fix covers both call() and call_with_tools()
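One way to keep `call()` and `call_with_tools()` in sync, sketched here with an illustrative helper (not the actual Hindsight code), is to build the request kwargs from the selected parameter name in one place so both paths use the same key:

```python
def completion_kwargs(param_name: str, limit: int) -> dict:
    """Build the token-limit kwarg under whichever name the API accepts.

    Shared by both call paths so the parameter name can never diverge
    between plain completions and tool-calling completions.
    """
    return {param_name: limit}
```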

Mistral (and several other providers) reject 'max_completion_tokens' with a 422
because they haven't adopted the newer OpenAI parameter name. When the openai
provider is configured with a custom base_url (e.g. Mistral, Together AI),
fall back to the widely-supported 'max_tokens' parameter.

Native OpenAI (no custom base_url) and Groq still use 'max_completion_tokens'.

Fixes vectorize-io#852
Collaborator

@nicoloboschi nicoloboschi left a comment


LGTM

@nicoloboschi nicoloboschi merged commit cd99eef into vectorize-io:main Apr 7, 2026
42 checks passed
nicoloboschi added a commit that referenced this pull request Apr 10, 2026
…penAI

PR #858 made the openai provider fall back to max_tokens whenever a custom
base_url was set, to support Mistral/Together-style endpoints. This regressed
two important setups:

1. Reasoning models (GPT-5, o1, o3) reject max_tokens outright with a 400
   ("Unsupported parameter: 'max_tokens' is not supported with this model.
   Use 'max_completion_tokens' instead.").
2. Azure OpenAI is fully OpenAI-API-compatible — it was only classified as
   "third-party compatible" because it requires a custom base_url.

The combination of the two — Azure OpenAI + GPT-5 — is the exact setup the
reporter hit in issue #978 and fails connection verification on startup.

Fix _max_tokens_param_name() so it:

- Always returns max_completion_tokens for reasoning models, regardless of
  base_url (they only support the new parameter name).
- Detects Azure OpenAI endpoints by the *.openai.azure.com hostname and
  treats them as native OpenAI.

The Mistral/Together behavior from #858 is preserved for non-reasoning
models on non-Azure custom base URLs.

Fixes #978
nicoloboschi added a commit that referenced this pull request Apr 13, 2026
…penAI (#979)

(Same commit message as above, merged as #979.)

Development

Successfully merging this pull request may close these issues.

OpenAI provider sends max_completion_tokens — breaks Mistral (and likely other) OpenAI-compatible endpoints
