Commit 258e072

Merge pull request #50 from ghostwright/phase2/provider-config
feat: add provider config for multi-backend subprocess routing
2 parents: 7b455a7 + 86ebf9a

20 files changed: 1001 additions & 10 deletions

.env.example

Lines changed: 19 additions & 3 deletions
````diff
@@ -3,12 +3,20 @@
 # cp .env.example .env
 
 # ========================
-# REQUIRED
+# REQUIRED: provider credential
 # ========================
+# Phantom defaults to Anthropic. Set ANTHROPIC_API_KEY for the default setup.
+# To use a different provider (Z.AI, OpenRouter, Ollama, vLLM, LiteLLM, custom),
+# configure the `provider:` block in phantom.yaml and set the matching env var
+# below. See docs/providers.md for the full reference.
 
-# Your Anthropic API key (starts with sk-ant-)
 ANTHROPIC_API_KEY=
 
+# Alternative provider keys (set one that matches your provider block in phantom.yaml):
+# ZAI_API_KEY=
+# OPENROUTER_API_KEY=
+# LITELLM_KEY=
+
 # ========================
 # OPTIONAL: Slack
 # ========================
@@ -36,12 +44,20 @@ ANTHROPIC_API_KEY=
 # Agent role (default: swe). Options: swe, base
 # PHANTOM_ROLE=swe
 
-# Claude model for the agent brain.
+# Model for the agent brain. Keep a Claude model ID here even when using a
+# non-Anthropic provider: the bundled cli.js has hardcoded capability checks
+# against Claude model names. Use `provider.model_mappings` in phantom.yaml
+# to redirect the wire call to your actual model (e.g., glm-5.1).
 # Options:
 #   claude-sonnet-4-6 - Fast, capable, lower cost (default, recommended)
 #   claude-opus-4-6   - Most capable, higher cost
 # PHANTOM_MODEL=claude-sonnet-4-6
 
+# Provider override via env var (alternative to editing phantom.yaml).
+# Options: anthropic (default), zai, openrouter, vllm, ollama, litellm, custom
+# PHANTOM_PROVIDER_TYPE=anthropic
+# PHANTOM_PROVIDER_BASE_URL=
+
 # Domain for public URL (e.g., ghostwright.dev)
 # When set with PHANTOM_NAME, derives public URL as https://<name>.<domain>
 # PHANTOM_DOMAIN=
````

CLAUDE.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -1,13 +1,13 @@
 # Phantom
 
-Phantom is an autonomous AI co-worker that runs as a persistent Bun process on a VM. It wraps the Claude Agent SDK (Opus 4.6), maintains vector-backed memory across sessions, rewrites its own configuration through a validated self-evolution engine, communicates via Slack/Telegram/Email/Webhook, and exposes all capabilities as an MCP server. 27,000+ lines of TypeScript, 822 tests, v0.18.2. Apache 2.0, repo at ghostwright/phantom.
+Phantom is an autonomous AI co-worker that runs as a persistent Bun process on a VM. It wraps the Claude Agent SDK as a subprocess (Anthropic by default, swappable via a `provider:` config block to Z.AI/GLM-5.1, OpenRouter, Ollama, vLLM, LiteLLM, or any Anthropic Messages API compatible endpoint). It maintains vector-backed memory across sessions, rewrites its own configuration through a validated self-evolution engine, communicates via Slack/Telegram/Email/Webhook, and exposes all capabilities as an MCP server. 27,000+ lines of TypeScript, 875 tests, v0.18.2. Apache 2.0, repo at ghostwright/phantom.
 
 ## Tech Stack
 
 | Layer | Technology |
 |-------|-----------|
 | Runtime | Bun (TypeScript-native, built-in SQLite, no bundler) |
-| Agent | Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) with Opus 4.6, 1M context |
+| Agent | Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) subprocess. Provider is configurable via `src/config/providers.ts`: Anthropic (default), Z.AI, OpenRouter, Ollama, vLLM, LiteLLM, custom. |
 | Memory | Qdrant (vector DB, Docker) + Ollama (nomic-embed-text, local embeddings) |
 | State | SQLite via Bun (sessions, tasks, metrics, evolution versions, scheduled jobs) |
 | Channels | Slack (Socket Mode, primary), Telegram (long polling), Email (IMAP/SMTP), Webhook (HMAC-SHA256), CLI |
@@ -41,7 +41,7 @@ If you find yourself writing a function that does something the agent can do bet
 
 ```bash
 bun install                          # Install dependencies
-bun test                             # Run 770 tests
+bun test                             # Run 875 tests
 bun run src/index.ts                 # Start the server
 bun run src/cli/main.ts init --yes   # Initialize config (reads env vars)
 bun run src/cli/main.ts doctor       # Check all subsystems
````

README.md

Lines changed: 30 additions & 1 deletion
````diff
@@ -7,7 +7,7 @@
 
 <p align="center">
   <a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License"></a>
-  <img src="https://img.shields.io/badge/tests-822%20passed-brightgreen.svg" alt="Tests">
+  <img src="https://img.shields.io/badge/tests-875%20passed-brightgreen.svg" alt="Tests">
   <a href="https://hub.docker.com/r/ghostwright/phantom"><img src="https://img.shields.io/docker/pulls/ghostwright/phantom.svg" alt="Docker Pulls"></a>
   <img src="https://img.shields.io/badge/version-0.18.2-orange.svg" alt="Version">
 </p>
@@ -75,6 +75,34 @@ A Phantom discovered [Vigil](https://github.com/baudsmithstudios/vigil), a light
 
 This is what happens when you give an AI its own computer.
 
+## Bring Your Own Model
+
+Phantom is not locked to any single AI backend. It ships with support for seven providers out of the box, configured through a single YAML block:
+
+- **Anthropic** (default) - Claude Opus, Sonnet, Haiku
+- **Z.AI** - GLM-5.1 and GLM-4.5-Air via [Z.AI's Anthropic-compatible API](https://docs.z.ai/guides/llm/glm-5). Roughly 15x cheaper than Claude Opus for comparable coding quality.
+- **OpenRouter** - 100+ models through one key
+- **Ollama** - Any GGUF model on your own GPU, zero API cost
+- **vLLM** - Self-hosted inference with OpenAI-compatible endpoints
+- **LiteLLM** - Local proxy bridging OpenAI, Gemini, and more
+- **Custom** - Any Anthropic Messages API compatible endpoint
+
+Switching providers is two lines of YAML:
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: zai
+  api_key_env: ZAI_API_KEY
+  model_mappings:
+    sonnet: glm-5.1
+```
+
+Set `ZAI_API_KEY` in `.env`, restart, done. Both the main agent and every evolution judge flow through the chosen provider from that point on. The tools are the same, the memory is the same, the self-evolution pipeline is the same. Only the brain changes.
+
+Anthropic stays the default. Existing deployments continue to work with no configuration changes. See [docs/providers.md](docs/providers.md) for the full reference.
+
 ## Quick Start
 
 ### Docker (recommended)
@@ -186,6 +214,7 @@ Because the agent that can only use pre-built tools hits a ceiling. Phantom buil
 | Feature | Why it matters |
 |---------|----------------|
 | **Its own computer** | Your laptop stays yours. The agent installs software, runs 24/7, and builds infrastructure on its own machine. |
+| **Bring your own model** | Anthropic, Z.AI (GLM-5.1), OpenRouter, Ollama, vLLM, LiteLLM, or any Anthropic Messages API compatible endpoint. Pick your backend in YAML, same agent everywhere. |
 | **Self-evolution** | The agent rewrites its own config after every session, validated by LLM judges. Day 30 knows things Day 1 didn't. |
 | **Persistent memory** | Three tiers of vector memory. Mention something on Monday, it uses it on Wednesday. No re-explaining. |
 | **Dynamic tools** | Creates and registers its own MCP tools at runtime. Tools survive restarts and work across sessions. |
````

config/phantom.yaml

Lines changed: 27 additions & 0 deletions
````diff
@@ -16,3 +16,30 @@ timeout_minutes: 240
 #   url: https://data.ghostwright.dev/mcp
 #   token: "bearer-token-for-data"
 #   description: "Data Analyst Phantom"
+
+# Provider selection. Defaults to Anthropic. Uncomment and customize to use
+# a different backend. Both the main agent and every LLM judge flow through
+# the chosen provider; authentication happens at the env var named below.
+#
+# provider:
+#   type: anthropic   # anthropic | zai | openrouter | vllm | ollama | litellm | custom
+#   # api_key_env: ANTHROPIC_API_KEY
+#
+# Example: GLM-5.1 via Z.AI's Anthropic-compatible API (15x cheaper than Opus)
+# provider:
+#   type: zai
+#   api_key_env: ZAI_API_KEY
+#   model_mappings:
+#     opus: glm-5.1
+#     sonnet: glm-5.1
+#     haiku: glm-4.5-air
+#
+# Example: Local vLLM server hosting any OpenAI-compatible model
+# provider:
+#   type: vllm
+#   base_url: http://localhost:8000
+#
+# Example: Local Ollama (free, runs on your GPU)
+# provider:
+#   type: ollama
+#   base_url: http://localhost:11434
````

docs/getting-started.md

Lines changed: 19 additions & 1 deletion
````diff
@@ -74,7 +74,25 @@ Open `.env` in your editor and fill in these values:
 ANTHROPIC_API_KEY=sk-ant-your-key-here
 ```
 
-Your Anthropic API key. This is the only value you absolutely must set.
+Your Anthropic API key. This is the only value you absolutely must set for the default setup.
+
+**Using a different provider?** Phantom supports Z.AI (GLM-5.1, ~15x cheaper than Claude Opus), OpenRouter, Ollama, vLLM, LiteLLM, and custom endpoints. For example, to run Phantom on Z.AI:
+
+```
+ZAI_API_KEY=your-zai-key
+```
+
+Then add this to `phantom.yaml`:
+
+```yaml
+provider:
+  type: zai
+  api_key_env: ZAI_API_KEY
+  model_mappings:
+    sonnet: glm-5.1
+```
+
+See [docs/providers.md](providers.md) for the full provider reference.
 
 ### Slack (recommended)
 
````

docs/providers.md

Lines changed: 188 additions & 0 deletions
````diff
@@ -0,0 +1,188 @@
+# Provider Configuration
+
+Phantom routes every LLM query (the main agent and every evolution judge) through the Claude Agent SDK as a subprocess. By setting environment variables that the bundled `cli.js` already honors, you can point that subprocess at any Anthropic Messages API compatible endpoint without changing a line of code.
+
+The `provider:` block in `phantom.yaml` is a small config surface that translates into those environment variables for you.
+
+## Supported Providers
+
+| Type | Base URL | API Key Env | Notes |
+|------|----------|-------------|-------|
+| `anthropic` (default) | `https://api.anthropic.com` | `ANTHROPIC_API_KEY` | Claude Opus, Sonnet, Haiku |
+| `zai` | `https://api.z.ai/api/anthropic` | `ZAI_API_KEY` | GLM-5.1 and GLM-4.5-Air, roughly 15x cheaper than Opus |
+| `openrouter` | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` | 100+ models through a single key |
+| `vllm` | `http://localhost:8000` | none | Self-hosted OpenAI-compatible inference |
+| `ollama` | `http://localhost:11434` | none | Local GGUF models, zero API cost |
+| `litellm` | `http://localhost:4000` | `LITELLM_KEY` | Local proxy bridging OpenAI, Gemini, and others |
+| `custom` | (you set it) | (you set it) | Any Anthropic Messages API compatible endpoint |
+
````
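For orientation, the preset table above can be captured as a small typed map. This is an illustrative sketch only, not the actual contents of `src/config/providers.ts`: the names `ProviderType`, `ProviderPreset`, and `PRESETS` are invented here; the URLs, key env names, and `disable_betas` defaults come from this document.

```typescript
// Illustrative sketch of the preset table above. The real table lives in
// src/config/providers.ts; these type and constant names are invented.
type ProviderType =
  | "anthropic" | "zai" | "openrouter" | "vllm" | "ollama" | "litellm" | "custom";

interface ProviderPreset {
  baseUrl?: string;      // undefined for `custom`: the operator must supply it
  apiKeyEnv?: string;    // undefined where no credential is required
  disableBetas: boolean; // true for every non-anthropic preset
}

const PRESETS: Record<ProviderType, ProviderPreset> = {
  anthropic:  { baseUrl: "https://api.anthropic.com",      apiKeyEnv: "ANTHROPIC_API_KEY",  disableBetas: false },
  zai:        { baseUrl: "https://api.z.ai/api/anthropic", apiKeyEnv: "ZAI_API_KEY",        disableBetas: true },
  openrouter: { baseUrl: "https://openrouter.ai/api/v1",   apiKeyEnv: "OPENROUTER_API_KEY", disableBetas: true },
  vllm:       { baseUrl: "http://localhost:8000",                                           disableBetas: true },
  ollama:     { baseUrl: "http://localhost:11434",                                          disableBetas: true },
  litellm:    { baseUrl: "http://localhost:4000",          apiKeyEnv: "LITELLM_KEY",        disableBetas: true },
  custom:     {                                                                             disableBetas: true },
};
```

A `provider:` block then only needs to name a `type`; everything else falls back to the preset and can be overridden field by field.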
````diff
+## Quick Reference
+
+### Anthropic (default)
+
+No configuration needed. Existing deployments continue to work unchanged.
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+# No provider block = defaults to anthropic
+```
+
+```bash
+# .env
+ANTHROPIC_API_KEY=sk-ant-...
+```
+
+### Z.AI / GLM-5.1
+
+Z.AI provides an Anthropic Messages API compatible endpoint at `https://api.z.ai/api/anthropic`. Phantom ships with a `zai` preset that points there automatically. Get a key at [docs.z.ai](https://docs.z.ai/guides/llm/glm-5).
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: zai
+  api_key_env: ZAI_API_KEY
+  model_mappings:
+    opus: glm-5.1
+    sonnet: glm-5.1
+    haiku: glm-4.5-air
+```
+
+```bash
+# .env
+ZAI_API_KEY=<your-zai-key>
+```
+
+Both the main agent and every evolution judge route through Z.AI. The `claude-sonnet-4-6` model name is translated to `glm-5.1` on the wire by the `model_mappings` block.
+
+### Ollama (local, free)
+
+Run any GGUF model on your own GPU. No API key needed.
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: ollama
+  model_mappings:
+    opus: qwen3-coder:32b
+    sonnet: qwen3-coder:32b
+    haiku: qwen3-coder:14b
+```
+
+Ollama must be running at `http://localhost:11434` (the preset default). The model must support function calling to work with Phantom's agent loop.
+
+### vLLM (self-hosted)
+
+For organizations running their own inference clusters.
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: vllm
+  base_url: http://your-vllm-server:8000
+  model_mappings:
+    sonnet: your-model-name
+  timeout_ms: 300000  # local models can be slow on first call
+```
+
+Start vLLM with `--tool-call-parser` matching your model for tool use to work.
+
+### OpenRouter
+
+Access 100+ models through a single OpenRouter key.
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: openrouter
+  api_key_env: OPENROUTER_API_KEY
+  model_mappings:
+    sonnet: anthropic/claude-sonnet-4.5
+```
+
+### LiteLLM (proxy)
+
+Run a local LiteLLM proxy to bridge OpenAI, Gemini, and other formats.
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: litellm
+  api_key_env: LITELLM_KEY
+  # base_url defaults to http://localhost:4000
+```
+
+### Custom endpoint
+
+For any Anthropic Messages API compatible proxy (LM Studio, custom internal gateways, etc.).
+
+```yaml
+# phantom.yaml
+model: claude-sonnet-4-6
+provider:
+  type: custom
+  base_url: https://your-proxy.internal/anthropic
+  api_key_env: YOUR_CUSTOM_KEY_ENV
+```
+
+## Configuration Fields
+
+| Field | Type | Default | Purpose |
+|-------|------|---------|---------|
+| `type` | enum | `anthropic` | One of the supported provider types |
+| `base_url` | URL | preset default | Override the endpoint URL |
+| `api_key_env` | string | preset default | Name of the env var holding the credential |
+| `model_mappings.opus` | string | none | Concrete model ID for the opus tier |
+| `model_mappings.sonnet` | string | none | Concrete model ID for the sonnet tier |
+| `model_mappings.haiku` | string | none | Concrete model ID for the haiku tier |
+| `disable_betas` | boolean | preset default | Sets `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1`. Defaults to true for every non-anthropic preset. |
+| `timeout_ms` | number | none | Sets `API_TIMEOUT_MS` for slow local inference |
+
+## Environment Variable Overrides
+
+For operators who prefer env variables over YAML edits:
+
+| Variable | Effect |
+|----------|--------|
+| `PHANTOM_PROVIDER_TYPE` | Override `provider.type` (validated against the supported values) |
+| `PHANTOM_PROVIDER_BASE_URL` | Override `provider.base_url` (validated as a URL) |
+| `PHANTOM_MODEL` | Override `config.model` |
+
+These are applied on top of the YAML-loaded config during startup.
+
````
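That override order can be sketched in a few lines. `applyProviderOverrides`, `LoadedConfig`, and `SUPPORTED_TYPES` are hypothetical names invented for this sketch; only the three `PHANTOM_*` variables and the validation behavior come from the table above, and the real startup code may differ.

```typescript
// Hedged sketch of the documented override order: env vars win over the
// YAML-loaded config. All identifiers here are illustrative.
interface LoadedConfig {
  model: string;
  provider: { type: string; base_url?: string };
}

const SUPPORTED_TYPES = ["anthropic", "zai", "openrouter", "vllm", "ollama", "litellm", "custom"];

function applyProviderOverrides(
  cfg: LoadedConfig,
  env: Record<string, string | undefined>,
): LoadedConfig {
  const out: LoadedConfig = { ...cfg, provider: { ...cfg.provider } };
  if (env.PHANTOM_PROVIDER_TYPE) {
    // Validated against the supported values, as the table states.
    if (!SUPPORTED_TYPES.includes(env.PHANTOM_PROVIDER_TYPE)) {
      throw new Error(`unsupported provider type: ${env.PHANTOM_PROVIDER_TYPE}`);
    }
    out.provider.type = env.PHANTOM_PROVIDER_TYPE;
  }
  if (env.PHANTOM_PROVIDER_BASE_URL) {
    new URL(env.PHANTOM_PROVIDER_BASE_URL); // throws on an invalid URL
    out.provider.base_url = env.PHANTOM_PROVIDER_BASE_URL;
  }
  if (env.PHANTOM_MODEL) out.model = env.PHANTOM_MODEL;
  return out;
}
```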
````diff
+## How It Works
+
+The Claude Agent SDK runs as a subprocess. The SDK's bundled `cli.js` reads `ANTHROPIC_BASE_URL` and the `ANTHROPIC_DEFAULT_*_MODEL` aliases at call time. When `ANTHROPIC_BASE_URL` points at a non-Anthropic host, all Messages API requests go there instead.
+
+The `provider:` block is translated into those environment variables by `buildProviderEnv()` in [`src/config/providers.ts`](../src/config/providers.ts). The resulting map is merged into both the main agent query and the evolution judge query, so changing providers flips both tiers in lockstep.
+
````
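A minimal sketch of that translation, using only the env var names documented in this file (`ANTHROPIC_BASE_URL`, the `ANTHROPIC_DEFAULT_*_MODEL` aliases, `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS`, `API_TIMEOUT_MS`). The real `buildProviderEnv()` may differ in shape; `buildProviderEnvSketch` is an invented name.

```typescript
// Sketch of the provider-block -> env-var translation described above.
// Not the real buildProviderEnv(); env var names come from this document.
interface ProviderBlock {
  type: string;
  base_url?: string;
  model_mappings?: { opus?: string; sonnet?: string; haiku?: string };
  disable_betas?: boolean;
  timeout_ms?: number;
}

function buildProviderEnvSketch(p: ProviderBlock): Record<string, string> {
  const env: Record<string, string> = {};
  if (p.base_url) env.ANTHROPIC_BASE_URL = p.base_url;
  const m = p.model_mappings ?? {};
  if (m.opus) env.ANTHROPIC_DEFAULT_OPUS_MODEL = m.opus;
  if (m.sonnet) env.ANTHROPIC_DEFAULT_SONNET_MODEL = m.sonnet;
  if (m.haiku) env.ANTHROPIC_DEFAULT_HAIKU_MODEL = m.haiku;
  if (p.disable_betas) env.CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS = "1";
  if (p.timeout_ms) env.API_TIMEOUT_MS = String(p.timeout_ms);
  return env;
}

// The Z.AI example from the quick reference:
const env = buildProviderEnvSketch({
  type: "zai",
  base_url: "https://api.z.ai/api/anthropic",
  model_mappings: { sonnet: "glm-5.1" },
  disable_betas: true,
});
```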
````diff
+## Why keep a Claude model name in `model:`?
+
+The bundled `cli.js` has hardcoded model-name arrays for capability detection (thinking tokens, effort levels, compaction, etc.). Passing a literal `glm-5.1` as the model can break those checks. The recommended pattern is:
+
+1. Set `model: claude-sonnet-4-6` (or Opus) in `phantom.yaml` so `cli.js` treats the call as a known Claude model
+2. Set `model_mappings.sonnet: glm-5.1` in the provider block so the wire call goes to GLM-5.1
+
+This is the same pattern Z.AI's own documentation recommends.
+
+## Troubleshooting
+
+**Phantom responds but the logs show Claude-shaped costs.**
+The bundled `cli.js` calculates `total_cost_usd` from its local Claude pricing table based on the model name string. Cost reporting is not provider-aware, so the logged cost will look like Claude pricing even when the request went to Z.AI or another provider. The actual charge on your provider's bill will differ.
+
+**Auto mode judges fall back to heuristic mode.**
+`resolveJudgeMode` in auto mode enables LLM judges when any of these are true: (a) a non-anthropic provider is configured, (b) `provider.base_url` is set, (c) `ANTHROPIC_API_KEY` is present, or (d) `~/.claude/.credentials.json` exists. If none hold, judges run in heuristic mode. Set `judges.enabled: always` in `config/evolution.yaml` to force LLM judges on.
````
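The auto-mode rule above reduces to a short predicate. A hedged sketch, with the four signals modeled as plain fields (`resolveJudgeModeSketch` and `JudgeSignals` are invented names; the real `resolveJudgeMode` signature is not shown in this document):

```typescript
// Sketch of the documented auto-mode rule: LLM judges are enabled when any
// of conditions (a)-(d) above holds, otherwise heuristic mode.
interface JudgeSignals {
  providerType: string;            // (a) non-anthropic provider configured
  providerBaseUrlSet: boolean;     // (b) provider.base_url is set
  anthropicKeyPresent: boolean;    // (c) ANTHROPIC_API_KEY in the environment
  claudeCredentialsFile: boolean;  // (d) ~/.claude/.credentials.json exists
}

function resolveJudgeModeSketch(s: JudgeSignals): "llm" | "heuristic" {
  const llmReachable =
    s.providerType !== "anthropic" ||
    s.providerBaseUrlSet ||
    s.anthropicKeyPresent ||
    s.claudeCredentialsFile;
  return llmReachable ? "llm" : "heuristic";
}
```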
````diff
+
+**Third-party proxy rejects a beta header.**
+`disable_betas: true` is already the default for every non-anthropic preset, which sets `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1`. If you still see beta header errors, explicitly set `disable_betas: true` on your provider block to make sure it overrides any custom `disable_betas: false`.
+
+**Tool calls fail with small local models.**
+Phantom's tool system assumes strong function-calling capability. Models like Qwen3-Coder and GLM-5.1 handle it well; smaller models often fail on complex multi-step tool chains. Test with a strong model first, then drop down.
+
+**Subprocess fails with a missing-credential error.**
+Phantom does not validate credentials at load time. The subprocess only sees the provider env vars when a query runs. If `api_key_env` names a variable that is not set in the process environment, the subprocess will fail at call time with the provider's own error message.
````

src/agent/__tests__/prompt-assembler.test.ts

Lines changed: 1 addition & 0 deletions
````diff
@@ -7,6 +7,7 @@ const baseConfig: PhantomConfig = {
   port: 3100,
   role: "swe",
   model: "claude-opus-4-6",
+  provider: { type: "anthropic" },
   effort: "max",
   max_budget_usd: 0,
   timeout_minutes: 240,
````
