Description
I’m using the ART LangGraph integration and the `init_chat_model` function with a local Ollama server as the model backend for my agent. The relevant code in my rollout function looks like this:

```python
chat_model = init_chat_model(model.name, temperature=1.0)
react_agent = create_react_agent(chat_model, tools)
```

I also set up the `art.Model` to point to my local Ollama server for inference (base URL + any key if needed). However, looking at the implementation of `init_chat_model`:
```python
def init_chat_model(...):
    config = CURRENT_CONFIG.get()
    return LoggingLLM(
        ChatOpenAI(
            base_url=config["base_url"],
            api_key=config["api_key"],
            model=config["model"],
            temperature=1.0,
        ),
        config["logger"],
    )
```

Several issues arise:
- Arguments are ignored
  - The `model`, `model_provider`, `configurable_fields`, `config_prefix`, and `**kwargs` parameters are not used at all.
  - This is surprising because I call `init_chat_model(model.name, temperature=1.0)`, but that `model.name` and `temperature` are ignored.
  - Everything is driven purely by the `CURRENT_CONFIG` contextvar.
- Hard binding to `ChatOpenAI`
  - `init_chat_model` always creates a `ChatOpenAI` instance, even if I’m not using OpenAI’s API.
  - When using a local Ollama server, it’s still bound to `ChatOpenAI` rather than accepting a generic LangChain chat model (e.g. `ChatOllama` or `ChatNVIDIA`).
- Unexpected OpenAI calls / tight OpenAI coupling
  - Even after pointing the inference base URL to my local Ollama, I still see attempts to call OpenAI endpoints somewhere in the pipeline (especially for judging / RULER).
  - When there’s a runtime error or long inference, the hardcoded 10-minute timeout in `LoggingLLM.ainvoke` gets hit:

    ```python
    result = await asyncio.wait_for(
        self.llm.ainvoke(input, config=config),
        timeout=10 * 60,
    )
    ```

  - This causes the agent to fail with a timeout, even though:
    - I’m using a local Ollama server.
    - I would like to configure a different timeout for long rollouts (a sketch of a configurable timeout follows this list).
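
To make the timeout point concrete, here is a minimal sketch of how `LoggingLLM` could take a configurable timeout instead of hardcoding `10 * 60`. This is not ART’s actual code; the `timeout` parameter and the `ART_INVOKE_TIMEOUT` environment variable are hypothetical names, and the logging details are omitted:

```python
import asyncio
import os


class LoggingLLM:
    """Sketch only: a LoggingLLM-like wrapper whose invoke timeout is configurable."""

    def __init__(self, llm, logger, timeout: float | None = None):
        self.llm = llm
        self.logger = logger
        # Hypothetical: explicit argument, then env var, then today's 10-minute default.
        self.timeout = timeout if timeout is not None else float(
            os.getenv("ART_INVOKE_TIMEOUT", str(10 * 60))
        )

    async def ainvoke(self, input, config=None):
        # Same call shape as today, but the timeout comes from configuration.
        return await asyncio.wait_for(
            self.llm.ainvoke(input, config=config),
            timeout=self.timeout,
        )
```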
What I expect
- `init_chat_model` should either:
  - Use its arguments (`model`, `model_provider`, etc.) and/or accept an explicit `ChatModel` instance (sketched after this list); or
  - Have a clearer contract that it depends entirely on `CURRENT_CONFIG` and is OpenAI-only.
- It should be possible to plug in:
  - Local Ollama (`ChatOllama` or a LiteLLM-based wrapper)
  - Other OpenAI-compatible endpoints

  without silently falling back to OpenAI-specific assumptions.
- The 10-minute timeout should either be:
  - Configurable; or
  - Documented with guidance on how to override it.
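
As a rough illustration of the first point, a provider-agnostic `init_chat_model` could accept an explicit LangChain chat model and otherwise keep today’s `CURRENT_CONFIG`-driven `ChatOpenAI` behavior. The `chat_model` and `timeout` parameters below are hypothetical, and the sketch assumes `CURRENT_CONFIG` and `LoggingLLM` are in scope as in the snippet above:

```python
from langchain_openai import ChatOpenAI


def init_chat_model(model=None, *, chat_model=None, timeout: float = 10 * 60, **kwargs):
    """Sketch only: use an explicit chat model if given, else fall back to ChatOpenAI."""
    config = CURRENT_CONFIG.get()
    if chat_model is None:
        # Default path: same ChatOpenAI construction as today, but honoring the
        # caller's model name and extra kwargs (e.g. temperature).
        chat_model = ChatOpenAI(
            base_url=config["base_url"],
            api_key=config["api_key"],
            model=model or config["model"],
            **kwargs,
        )
    # Hypothetical: LoggingLLM accepts a timeout instead of hardcoding 10 * 60.
    return LoggingLLM(chat_model, config["logger"], timeout=timeout)
```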
What actually happens
- Even after configuring a local Ollama backend and setting inference URLs, ART:
  - Uses `ChatOpenAI` internally.
  - Still exhibits OpenAI-specific behavior.
  - Hits the hard-coded 10-minute timeout during agent inference / error cases.
Request
- Please make `init_chat_model` provider-agnostic, or introduce a way to:
  - Pass in a custom `ChatModel` (e.g. `ChatOllama`, `ChatNVIDIA`).
  - Configure the timeout instead of hardcoding `10 * 60`.
- Alternatively, document the intended usage pattern if this function is meant strictly for OpenAI-style backends.
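
For reference, this is roughly the call site I would like to end up with in my rollout. `ChatOllama` (from the `langchain-ollama` package) and `create_react_agent` (from `langgraph.prebuilt`) are existing LangChain/LangGraph APIs; the `chat_model` and `timeout` arguments to `init_chat_model` are the hypothetical additions sketched above:

```python
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent

# Hypothetical usage if init_chat_model accepted an explicit chat model and timeout.
chat_model = init_chat_model(
    chat_model=ChatOllama(
        model="llama3.1",                   # local model served by Ollama
        base_url="http://localhost:11434",  # default Ollama endpoint
        temperature=1.0,
    ),
    timeout=30 * 60,  # allow long rollouts instead of the fixed 10 minutes
)
react_agent = create_react_agent(chat_model, tools)  # tools defined elsewhere in the rollout
```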