Commit f302cee

langgraph works with Anthropic API
1 parent 8a3d650 commit f302cee

6 files changed

Lines changed: 1102 additions & 1 deletion

python/evals/eval_agent.py

Lines changed: 2 additions & 1 deletion
@@ -49,6 +49,7 @@
 # Each entry: (module_path_relative_to_starter_dir, class_name)
 ADAPTERS: dict[str, tuple[str, str]] = {
     "claude": ("claude.agent", "ClaudeAdapter"),
+    "langgraph": ("langgraph.agent", "LangGraphAdapter"),
     "beginner": ("beginner.agent", "Agent"),
     "intermediate": ("intermediate.agent", "Agent"),
     "llm": ("llm.agent", "Agent"),
@@ -92,7 +93,7 @@ def load_agent(adapter_name: str, model: str | None = None):
     # Instantiate with model if the adapter accepts it
     if adapter_name == "llm" and model:
         return cls(model_path=model)
-    elif adapter_name == "claude" and model:
+    elif adapter_name in ("claude", "langgraph") and model:
         return cls(model=model)
     else:
         return cls()
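
The registry in this hunk maps an adapter name to a module path and a class name. One way such a table might be resolved at load time is with `importlib`. The sketch below is an assumption about the general pattern, not the actual body of `load_agent`; `DEMO_ADAPTERS` and `load_class` are hypothetical names, and standard-library classes stand in for the real adapters so the example runs without the starter packages.

```python
import importlib

# Hypothetical registry in the same shape as ADAPTERS: name -> (module, class).
# Standard-library classes stand in for the real adapter classes.
DEMO_ADAPTERS: dict[str, tuple[str, str]] = {
    "counter": ("collections", "Counter"),
    "fraction": ("fractions", "Fraction"),
}


def load_class(adapter_name: str) -> type:
    """Resolve a registry entry to the class object it names."""
    try:
        module_path, class_name = DEMO_ADAPTERS[adapter_name]
    except KeyError:
        known = ", ".join(sorted(DEMO_ADAPTERS))
        raise ValueError(
            f"Unknown adapter {adapter_name!r}; expected one of: {known}"
        ) from None
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


cls = load_class("counter")
instance = cls("aab")  # constructor arguments depend on the adapter
```

Keeping the lookup separate from instantiation lets the caller decide which keyword arguments (such as `model` or `model_path`) each adapter receives.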

starters/langgraph/README.md

Lines changed: 289 additions & 0 deletions
@@ -0,0 +1,289 @@
# LangGraph Starter — Learn LangGraph by Building a Game Agent

This starter teaches you **LangGraph** by building an AI agent that plays Agent Arena scenarios.

**Key idea:** "Want to learn LangGraph? Build an AI agent that plays a game."

## What You'll Learn

- **Graph construction** — StateGraph, adding nodes, connecting edges
- **State schema** — TypedDict with message reducers (`add_messages`)
- **Tool binding** — Giving the LLM structured tools via `bind_tools()`
- **Conditional routing** — Branching the graph based on LLM output
- **ReAct pattern** — The observe-think-act loop as a graph
- **Message types** — SystemMessage, HumanMessage, AIMessage

## Prerequisites

1. An **Anthropic API key** — get one at [console.anthropic.com](https://console.anthropic.com)
2. Python 3.11+
3. Agent Arena game (Godot) running

## Quick Start

```bash
# 1. Set your API key
export ANTHROPIC_API_KEY=sk-ant-...

# 2. Install dependencies
pip install -r requirements.txt

# 3. Start the agent
python run.py

# 4. In Godot: open scenes/foraging.tscn -> F5 -> SPACE
```

Your agent will start making decisions using a LangGraph agent graph!

## Files

| File | What it does |
|------|-------------|
| `agent.py` | `LangGraphAdapter` — builds the agent graph, invokes it, extracts decisions |
| `run.py` | Entry point — parses args, creates adapter, starts server |
| `requirements.txt` | Dependencies (agent-arena-sdk, langgraph, langchain-anthropic) |

## How It Works

Each game tick:

```
Godot sends Observation (what the agent sees)
        |
LangGraphAdapter.format_observation() -> text context
        |
Graph invoked with [SystemMessage, HumanMessage]
        |
        v
    +-------+    tool call?    +-------+
    | agent | ----YES--------> | tools | --> END
    |       | ----NO---------> END     |
    +-------+                  +-------+
        |                          |
LLM reads context          No-op passthrough
+ tool definitions         (game executes action)
        |
        v
Extract tool call from AIMessage -> Decision
        |
Decision sent back to Godot
```

### The Key Concepts

#### 1. StateGraph — The Foundation

Everything in LangGraph starts with a `StateGraph`. It defines what data flows through the graph (the "state") and how nodes transform it:

```python
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

graph = StateGraph(AgentState)
```

The `add_messages` annotation is a **reducer** — it tells LangGraph to *append* new messages rather than replacing the list. This is how conversation history builds up.

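
To make the reducer idea concrete, here is a plain-Python sketch of how a state update might be merged when a key has a reducer versus when it does not. `merge_update` and `append_reducer` are hypothetical names used only for illustration. This is a simplification, not LangGraph's implementation (the real `add_messages` also matches messages by ID).

```python
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]


def append_reducer(existing: list, new: list) -> list:
    """Combine old and new values by concatenation instead of overwriting."""
    return existing + new


def merge_update(state: dict, update: dict, reducers: dict[str, Reducer]) -> dict:
    """Return a new state: keys with a reducer are combined, others replaced."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["system prompt", "observation"], "tick": 1}
update = {"messages": ["ai response"], "tick": 2}

with_reducer = merge_update(state, update, {"messages": append_reducer})
without_reducer = merge_update(state, update, {})
```

With the reducer, `messages` grows by one entry; without it, the node's return value would overwrite the history.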
#### 2. Nodes — Processing Steps

Nodes are functions that take the current state and return updates:

```python
def agent_node(state: AgentState) -> dict:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}  # Appended via add_messages

graph.add_node("agent", agent_node)
```

#### 3. Conditional Edges — Decision Routing

After a node runs, conditional edges inspect the state and choose the next node:

```python
def should_continue(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tools"  # LLM called a tool
    return END  # LLM just returned text

graph.add_conditional_edges("agent", should_continue, ...)
```

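
The routing function only looks at the last message, so it can be exercised without an LLM. The sketch below uses a small stand-in class in place of `AIMessage` and the string `"__end__"` in place of LangGraph's `END` constant. Both are assumptions made so the example runs with the standard library alone.

```python
from dataclasses import dataclass, field

END = "__end__"  # stand-in for langgraph.graph.END (assumed value)


@dataclass
class FakeAIMessage:
    """Minimal stand-in for an AIMessage: text plus optional tool calls."""
    content: str = ""
    tool_calls: list[dict] = field(default_factory=list)


def should_continue(state: dict) -> str:
    """Route to the tools node if the last message requested a tool."""
    last_message = state["messages"][-1]
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return END


tool_turn = {"messages": [FakeAIMessage(tool_calls=[{"name": "move_to", "args": {}}])]}
text_turn = {"messages": [FakeAIMessage(content="I will wait.")]}
```

Using `getattr` with a default keeps the function safe if the last message is a type that has no `tool_calls` attribute.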
#### 4. Tool Binding — Structured Actions

`bind_tools()` attaches tool definitions to the LLM so it can call them with typed parameters:

```python
tools = [schema.to_openai_format() for schema in self.get_action_tools()]
llm_with_tools = llm.bind_tools(tools)
```

The LLM responds with an `AIMessage` containing `tool_calls`:

```python
ai_message.tool_calls[0]
# {"name": "move_to", "args": {"target_position": [10.0, 0.0, 5.0]}}
```

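
Turning a tool call into a game decision could look like the sketch below. The `Decision` dataclass, the `idle` fallback, and `decision_from_tool_calls` are hypothetical names; the SDK's real decision type and the adapter's actual fallback logic may differ. The tool call dict follows the `{"name": ..., "args": ...}` shape shown above.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """Hypothetical decision record: which tool to run and with what parameters."""
    tool: str
    params: dict = field(default_factory=dict)
    parse_method: str = "tool_call"


def decision_from_tool_calls(tool_calls: list[dict]) -> Decision:
    """Use the first tool call if present, otherwise fall back to idling."""
    if not tool_calls:
        # Assumed fallback for the case where the LLM returned text only.
        return Decision(tool="idle", parse_method="fallback")
    first = tool_calls[0]
    return Decision(tool=first["name"], params=dict(first.get("args", {})))


move = decision_from_tool_calls(
    [{"name": "move_to", "args": {"target_position": [10.0, 0.0, 5.0]}}]
)
nothing = decision_from_tool_calls([])
```

Only the first tool call is used here because each tick carries a single action; any extra calls in the same message are ignored in this sketch.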
#### 5. Single-Action vs Multi-Step

In a standard ReAct agent, the `tools` node executes the tool and routes back to `agent` for another round. In Agent Arena, each tick is one action, so:

```python
graph.add_edge("tools", END)  # Stop after one tool call
# vs.
# graph.add_edge("tools", "agent")  # Loop for multi-step reasoning
```

## Customization

### Change the System Prompt

Edit `SYSTEM_PROMPT` at the top of `agent.py`. Try:
- Adding personality ("You are a cautious agent that avoids all risk")
- Changing strategy ("Always explore before collecting")
- Adding domain knowledge ("Fire hazards deal 10 damage per tick")

### Change the Model

```bash
python run.py --model claude-haiku-4-5-20251001  # Fastest, cheapest
python run.py --model claude-sonnet-4-20250514   # Balanced (default)
python run.py --model claude-opus-4-20250514     # Most capable
```

### Swap to OpenAI

1. Update `requirements.txt`:
   ```
   langchain-openai>=0.3.0  # Replace langchain-anthropic
   ```

2. In `agent.py`, change the import and LLM creation:
   ```python
   from langchain_openai import ChatOpenAI

   self.llm = ChatOpenAI(
       model="gpt-4o",
       max_tokens=max_tokens,
       api_key=api_key or os.environ.get("OPENAI_API_KEY"),
   )
   ```

Everything else stays the same, because `bind_tools()` and `invoke()` are part of LangChain's shared chat-model interface, so the graph does not depend on the LLM provider.

### Add Memory (Checkpointing)

LangGraph has built-in memory via checkpointers. The checkpointer is passed at compile time, so add it where `_build_graph()` compiles the graph to persist state across ticks:

```python
from langgraph.checkpoint.memory import MemorySaver

# In _build_graph(), pass a checkpointer when compiling the StateGraph
# (`graph` is the uncompiled StateGraph built in that method):
checkpointer = MemorySaver()
self.graph = graph.compile(checkpointer=checkpointer)

# Invoke with a thread_id to maintain conversation history:
result = self.graph.invoke(
    {"messages": [HumanMessage(content=obs_text)]},
    config={"configurable": {"thread_id": "agent-1"}},
)
```

### Add Multi-Step Reasoning

To let the agent reason across multiple tool calls before acting (e.g., query memory then decide), change the graph to loop:

```python
# Instead of: graph.add_edge("tools", END)
graph.add_edge("tools", "agent")  # Loop back for another round
```

Then add "query" tools (spatial memory, episode memory) alongside the action tools. The agent will call query tools to gather info, then call an action tool to act. In this setup the `tools` node has to return a `ToolMessage` for each query call so the LLM can see the result, and routing should still end the tick once an action tool is called.

### Restructure the Graph

Add new nodes for preprocessing, memory, or planning:

```python
graph.add_node("preprocess", preprocess_node)  # Clean observation
graph.add_node("memory", memory_node)          # Query past experiences
graph.add_node("agent", agent_node)            # LLM decision
graph.add_node("tools", tools_node)            # Execute tools

graph.set_entry_point("preprocess")
graph.add_edge("preprocess", "memory")
graph.add_edge("memory", "agent")
# ... conditional edges for agent -> tools
```

## Cost Estimation

Each tick costs approximately (using Anthropic via LangChain):
- **Haiku**: ~0.1 cent (500 input + 100 output tokens)
- **Sonnet**: ~0.5 cent
- **Opus**: ~2.5 cents

At the per-tick estimate above, a typical foraging run (100 ticks) costs ~$0.50 with Sonnet.

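
The per-tick figures follow from token counts and per-token prices. The sketch below shows the arithmetic. The prices passed in are illustrative placeholders (assumptions, in USD per million tokens), so check the provider's current price list before relying on them.

```python
def tick_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost of one LLM call in USD, given prices per million tokens."""
    return (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000


def run_cost_usd(per_tick_usd: float, ticks: int) -> float:
    """Cost of a whole run, assuming every tick costs the same."""
    return per_tick_usd * ticks


# Placeholder prices: 1.00 USD per million input tokens, 5.00 per million output.
per_tick = tick_cost_usd(500, 100, 1.00, 5.00)
per_run = run_cost_usd(per_tick, 100)
print(f"per tick: ${per_tick:.4f}, per 100 ticks: ${per_run:.2f}")
```

Real input counts also include the system prompt and tool definitions, so measuring tokens from an actual trace gives a better estimate than a fixed guess.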
## Debugging

### Enable Debug Viewer

```bash
python run.py --debug
# Open http://127.0.0.1:5000/debug in your browser
```

### View Traces

The adapter records each decision in `self.last_trace` with:
- System prompt sent
- Observation context sent
- Tokens used
- Parse method (tool_call, fallback, error)
- Final decision

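
A trace with these fields could be modeled as a small record type. The sketch below is only an illustration: `DecisionTrace` and its field names are hypothetical, and the actual structure of `self.last_trace` in `agent.py` may differ.

```python
from dataclasses import asdict, dataclass
import json


@dataclass
class DecisionTrace:
    """Hypothetical trace record mirroring the fields listed above."""
    system_prompt: str
    observation: str
    tokens_used: int
    parse_method: str  # "tool_call", "fallback", or "error"
    decision: dict


trace = DecisionTrace(
    system_prompt="You are a foraging agent.",
    observation="Tick 12: berries visible to the north.",
    tokens_used=612,
    parse_method="tool_call",
    decision={"tool": "move_to", "params": {"target_position": [10.0, 0.0, 5.0]}},
)

# Serialize for logging or for display in a debug view.
trace_json = json.dumps(asdict(trace), indent=2)
```

Filtering stored traces by `parse_method` is a quick way to see how often the LLM skipped the tool call and the fallback ran.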
### LangSmith Integration

LangGraph integrates natively with [LangSmith](https://smith.langchain.com/) for tracing:

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=ls-...
python run.py
```

Every graph invocation will appear in the LangSmith dashboard with full execution traces.

### Common Issues

**"LLM did not call a tool"** — The LLM sometimes returns text without calling a tool. The adapter falls back to observation-based logic. Try making the system prompt more directive.

**High latency** — Each tick requires an API round-trip. Use Haiku for faster responses.

**"ANTHROPIC_API_KEY not set"** — Export your API key: `export ANTHROPIC_API_KEY=sk-ant-...`

## Comparison with Claude Starter

| Feature | Claude Starter | LangGraph Starter |
|---------|---------------|-------------------|
| Approach | Direct Anthropic API | Graph-based agent |
| State management | Manual | LangGraph state + reducers |
| Tool format | Anthropic native | OpenAI function calling |
| Extensibility | Modify `decide()` | Add nodes and edges |
| Memory | Manual implementation | Built-in checkpointers |
| Multi-step reasoning | Not built-in | Native (loop tools -> agent) |
| Observability | Custom traces | LangSmith integration |
| LLM provider | Anthropic only | Any LangChain chat model |

## Next Steps

- Add **checkpointing** for memory across ticks
- Add **query tools** (spatial memory, episode memory) as non-terminal reasoning steps
- Enable **multi-step reasoning** by routing `tools -> agent` in the graph
- Try the **`create_react_agent`** shortcut: `from langgraph.prebuilt import create_react_agent`
- Read the [LangGraph docs](https://langchain-ai.github.io/langgraph/) to learn more
