
init_chat_model always uses ChatOpenAI, ignores args, and still calls OpenAI / hits 10-minute timeout with Ollama #474

@ansh-info

Description


I’m using the ART LangGraph integration and the init_chat_model function with a local Ollama server as the model backend for my agent. The relevant code in my rollout function looks like this:

chat_model = init_chat_model(model.name, temperature=1.0)
react_agent = create_react_agent(chat_model, tools)
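
For context, the art.Model passed in here is configured along these lines. The field names (inference_base_url, inference_api_key, inference_model_name) reflect my understanding of art.Model's inference settings, and the URL, key, and model name below are placeholders rather than my exact values:

import art

model = art.Model(
    name="ollama-agent",
    project="my-project",
    inference_base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    inference_api_key="ollama",                      # placeholder; Ollama doesn't check it
    inference_model_name="qwen2.5:7b",               # whatever the Ollama server is serving
)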

As shown, the art.Model points at my local Ollama server for inference (base URL plus a key where needed). However, looking at the implementation of init_chat_model:

def init_chat_model(...):
    config = CURRENT_CONFIG.get()
    return LoggingLLM(
        ChatOpenAI(
            base_url=config["base_url"],
            api_key=config["api_key"],
            model=config["model"],
            temperature=1.0,
        ),
        config["logger"],
    )

Several issues arise:

  1. Arguments are ignored

    • The model, model_provider, configurable_fields, config_prefix, and **kwargs are not used at all.
    • This is surprising because I call init_chat_model(model.name, temperature=1.0), yet both model.name and temperature are silently ignored.
    • Everything is driven purely by the CURRENT_CONFIG contextvar.
  2. Hard binding to ChatOpenAI

    • init_chat_model always creates a ChatOpenAI instance, even if I’m not using OpenAI’s API.
    • When using a local Ollama server, it’s still bound to ChatOpenAI rather than accepting a generic LangChain chat model (e.g. ChatOllama or ChatNVIDIA).
  3. Unexpected OpenAI calls / tight OpenAI coupling

    • Even after pointing the inference base URL to my local Ollama, I still see attempts to call OpenAI endpoints somewhere in the pipeline (especially for judging / RULER).

    • When there’s a runtime error or long inference, the hardcoded 10-minute timeout in LoggingLLM.ainvoke gets hit:

      result = await asyncio.wait_for(
          self.llm.ainvoke(input, config=config), timeout=10 * 60
      )
    • This causes the agent to fail with a timeout, even though:

      • I’m using a local Ollama server, where long generations are expected; and
      • there is no way to configure a longer timeout for long rollouts (a minimal illustration of this follows the list).
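
To illustrate the mechanics (this is not ART code, just a minimal stand-in for a generation that runs longer than the limit):

import asyncio

async def slow_generation():
    # Stand-in for a long local Ollama generation or a stuck request.
    await asyncio.sleep(11 * 60)

# Mirrors the wait_for call in LoggingLLM.ainvoke: anything slower than 10 minutes
# raises asyncio.TimeoutError, regardless of which backend serves the request.
asyncio.run(asyncio.wait_for(slow_generation(), timeout=10 * 60))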

What I expect

  • init_chat_model should either:

    • Use its arguments (model, model_provider, etc.) and/or accept an explicit ChatModel instance (a rough sketch of what this could look like follows this list); or
    • Have a clearer contract that it depends entirely on CURRENT_CONFIG and is OpenAI-only.
  • It should be possible to plug in:

    • a local Ollama server (via ChatOllama or a LiteLLM-based wrapper), or
    • another OpenAI-compatible endpoint,

    without silently falling back to OpenAI-specific assumptions.
  • The 10-minute timeout should either be:

    • Configurable; or
    • Documented with guidance on how to override it.
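
To make the expectation concrete, here is a rough sketch of the direction I mean. It is only an illustration: the signature is mine rather than a concrete API proposal, it assumes LoggingLLM could accept a timeout argument, and CURRENT_CONFIG / LoggingLLM refer to the existing ART internals quoted above.

from langchain_openai import ChatOpenAI

def init_chat_model(model=None, *, chat_model=None, timeout=10 * 60, **kwargs):
    config = CURRENT_CONFIG.get()
    if chat_model is None:
        # Keep the current OpenAI-compatible default, but actually honor the arguments.
        chat_model = ChatOpenAI(
            base_url=config["base_url"],
            api_key=config["api_key"],
            model=model or config["model"],
            **kwargs,  # e.g. temperature=1.0 now takes effect
        )
    # Assumes LoggingLLM is extended to accept a configurable timeout.
    return LoggingLLM(chat_model, config["logger"], timeout=timeout)

With that shape, a rollout could do either of the following (ChatOllama comes from the langchain-ollama package; the model name is just an example):

from langchain_ollama import ChatOllama

chat_model = init_chat_model(model.name, temperature=1.0, timeout=30 * 60)
# or, with an explicit chat model:
chat_model = init_chat_model(chat_model=ChatOllama(model="qwen2.5:7b"), timeout=30 * 60)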

What actually happens

  • Even after configuring a local Ollama backend and setting inference URLs, ART:

    • Uses ChatOpenAI internally.
    • Still exhibits OpenAI-specific behavior.
    • Hits the hard-coded 10-minute timeout during agent inference / error cases.

Request

  • Please make init_chat_model provider-agnostic, or introduce a way to:

    • Pass in a custom ChatModel (e.g. ChatOllama or ChatNVIDIA); a possible interim workaround is sketched after this list.
    • Configure the timeout instead of hard-coding 10 * 60.
  • Alternatively, document the intended usage pattern if this function is meant strictly for OpenAI-style backends.
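
A possible interim workaround (sketch only; the model name and base URL are examples) is to bypass init_chat_model and hand a LangChain chat model straight to create_react_agent, at the cost of losing the LoggingLLM logging and timeout wrapper:

from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent

# Bypasses init_chat_model (and therefore LoggingLLM) entirely.
chat_model = ChatOllama(
    model="qwen2.5:7b",                 # example model served by Ollama
    base_url="http://localhost:11434",  # default Ollama endpoint
    temperature=1.0,
)
react_agent = create_react_agent(chat_model, tools)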
