Azure provider: cannot target chat/completions (404 on responses); need explicit config/docs for Azure wire and path construction #2025

@santiago-afonso

Description

What version of Codex is running?

codex-cli 0.19.0

Which model were you using?

gpt-5

What platform is your computer?

Linux 6.6.87.1-microsoft-standard-WSL2 x86_64 x86_64

What steps can reproduce the bug?

Environment

  • Codex CLI: v0.19 (Rust)
  • OS: Linux (x86-64)
  • Auth: Entra ID (AAD) access token via client credentials (raw JWT in env var)
  • Azure target: Azure OpenAI fronted by API Management (APIM)
  • Config method: ~/.codex/config.toml + --profile
  • Model: Azure deployment name (not base model)

Summary
Codex CLI’s Azure provider appears to call the Responses API path:

/openai/deployments/{deployment}/responses?api-version=...

Our APIM only exposes Chat Completions:

/openai/deployments/{deployment}/chat/completions?api-version=...

Result: Codex returns 404 Not Found consistently, while curl and the Python SDK succeed using chat/completions.
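To make the mismatch concrete, here is a small Python sketch of the two URL shapes (host, deployment name, and api-version are placeholders from this report, not real values):

```python
# Sketch: the two Azure OpenAI per-deployment URL shapes in play.
BASE = "https://<YOUR-APIM-DOMAIN>/openai"

def azure_url(deployment: str, wire: str, api_version: str) -> str:
    """Build the per-deployment Azure OpenAI URL for a given wire API path."""
    return f"{BASE}/deployments/{deployment}/{wire}?api-version={api_version}"

# What Codex appears to call (404 on our APIM):
responses_url = azure_url("<DEPLOY>", "responses", "2025-04-01-preview")
# What our APIM actually exposes (200 OK):
chat_url = azure_url("<DEPLOY>", "chat/completions", "2025-04-01-preview")

print(responses_url)
print(chat_url)
```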

What works

  • Mint Entra ID access token (client credentials).
  • curl to POST .../openai/deployments/<DEPLOY>/chat/completions?api-version=2025-04-01-preview returns 200 OK with a normal response.
  • Python AzureOpenAI + chat.completions.create(...) returns 200 OK against the same endpoint.

What fails

  • Codex CLI (with Azure provider + profile) returns 404 Not Found repeatedly.
  • Hitting .../openai/deployments/<DEPLOY>/responses?api-version=... with curl also returns 404 on our APIM (that path isn’t exposed).

Minimal (sanitized) config

# ~/.codex/config.toml
model_provider = "azure"
model = "<AZURE_DEPLOYMENT_NAME>"

[model_providers.azure]
name = "Azure"
# Base ends in /openai so provider appends /deployments/{model}/...
base_url = "https://<YOUR-APIM-DOMAIN>/openai"
query_params = { api-version = "2025-04-01-preview" }
# We need to force Chat Completions here, but there’s no documented knob.
# (Tried variations; CLI still hits /responses.)

[profiles.az-qa]
model_provider = "azure"
model = "<AZURE_DEPLOYMENT_NAME>"

Invocation

export AZURE_OPENAI_API_KEY="<RAW_JWT_FROM_AAD>"   # no "Bearer " prefix
codex --profile az-qa -m "<AZURE_DEPLOYMENT_NAME>" "say hello"
# => 404 Not Found
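On the raw-JWT-vs-"Bearer " question above: what the CLI actually sends is exactly what we'd like documented. A tiny hypothetical helper (not part of Codex) shows the normalization we'd expect, i.e. a raw JWT in the env var becoming a Bearer-prefixed Authorization header on the wire:

```python
# Hypothetical helper: normalize a token into an Authorization header value.
# Tolerates an already-prefixed token so either env-var convention works.
def auth_header(token: str) -> str:
    """Return a 'Bearer <jwt>' header value from a raw or prefixed token."""
    token = token.strip()
    if token.lower().startswith("bearer "):
        return "Bearer " + token[7:].strip()
    return f"Bearer {token}"
```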

Logs (excerpt)

... INFO BackgroundEvent: stream error: unexpected status 404 Not Found: {"statusCode":404,"message":"Resource not found"}
... WARN stream disconnected - retrying ...

Repro (working vs failing)

# Working (Chat Completions)
curl -sS -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -X POST "https://<YOUR-APIM-DOMAIN>/openai/deployments/<DEPLOY>/chat/completions?api-version=2025-04-01-preview" \
  -d '{"messages":[{"role":"user","content":"ping"}]}'
# => 200 OK

# Failing (Responses)
curl -sS -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -X POST "https://<YOUR-APIM-DOMAIN>/openai/deployments/<DEPLOY>/responses?api-version=2025-04-01-preview" \
  -d '{"input":[{"role":"user","content":"ping"}]}'
# => 404 Not Found
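Note the request bodies also differ between the two wires: Chat Completions takes `messages`, Responses takes `input`. For simple text-only turns like the repro above, the mapping is essentially a rename (a sketch; real payloads carry more fields):

```python
# Sketch: map a minimal Chat Completions body to the equivalent
# Responses body. Assumes simple role/content turns only.
def chat_to_responses(body: dict) -> dict:
    """Rename 'messages' to 'input', passing other fields through."""
    out = {k: v for k, v in body.items() if k != "messages"}
    out["input"] = body["messages"]
    return out
```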

Expected

  • Either:

    1. A documented configuration option to select the Azure wire (e.g., wire_api = "chat.completions" vs wire_api = "responses" in the Azure provider block); or
    2. A path-template override (e.g., path_template = "/deployments/{model}/chat/completions").

  • And documentation that clearly shows:

    • base_url must end with /openai.
    • model must be the Azure deployment name.
    • How to set api-version via query_params.
    • How to provide an AAD access token (raw JWT vs "Bearer "), plus whether API keys are supported and which header the CLI sends.
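Concretely, the requested knob might look like either of the following in the provider block (both keys are hypothetical here, since neither is documented today):

```toml
[model_providers.azure]
name = "Azure"
base_url = "https://<YOUR-APIM-DOMAIN>/openai"
query_params = { api-version = "2025-04-01-preview" }
# Option 1 (hypothetical): select the wire explicitly
wire_api = "chat.completions"
# Option 2 (hypothetical): override the path template instead
# path_template = "/deployments/{model}/chat/completions"
```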

Actual

  • The CLI targets /responses with no documented way to force chat/completions, leading to 404s on Azure setups that only expose Chat Completions via APIM.

Requests

  1. Bug/feature: Add a documented way to select the Azure wire (Chat Completions vs Responses) or to override the path template for Azure.

  2. Docs: Provide an end-to-end Azure section with:

    • Minimal TOML for Azure.
    • AAD Device Code and Client Credentials examples that mint a token and run Codex.
    • Exact path shapes Codex will call for each wire.
    • api-version configuration.
    • Troubleshooting guidance (401 vs 404, where logs live, how to confirm the final URL).

Why this matters
Many Azure/APIM deployments expose Chat Completions but not Responses. Without a documented knob to choose the wire (or override the path), Codex can’t interoperate with otherwise standard Azure OpenAI setups.

Thanks!


Metadata
    Labels

    azure (Issues related to the Azure-hosted OpenAI models), bug (Something isn't working)
