Local TypeScript proxy that lets Codex CLI talk to Cloudflare Workers AI models through an OpenAI-compatible Responses API surface.
- Quick Start
- Prerequisites
- Installation
- Usage with Codex
- Recommended Models
- Running as a Daemon
- API Endpoints
- Prompt Caching
- Troubleshooting
- Development
- Limitations
git clone https://github.com/pitzcarraldo/codex-workers-ai-proxy.git
cd codex-workers-ai-proxy
cp .env.example .env
# Edit .env with your Cloudflare credentials:
# CLOUDFLARE_ACCOUNT_ID=<your-account-id>
# CLOUDFLARE_API_TOKEN=<your-api-token>
npm start

Then, in another terminal:
curl -fsS http://127.0.0.1:4319/codex/model-catalog.json -o /tmp/codex-model-catalog.json
codex \
-m @cf/moonshotai/kimi-k2.5 \
-c 'model_provider="cloudflare_proxy"' \
-c 'model_catalog_json="/tmp/codex-model-catalog.json"' \
-c 'model_providers.cloudflare_proxy.name="Cloudflare Proxy"' \
-c 'model_providers.cloudflare_proxy.base_url="http://127.0.0.1:4319/v1"' \
-c 'model_providers.cloudflare_proxy.wire_api="responses"' \
-c 'model_providers.cloudflare_proxy.supports_websockets=false'

That's it. For a more ergonomic setup, see Usage with Codex.
- Node.js (runs .ts entrypoints directly, no transpilation needed) or Bun
- Cloudflare account with Workers AI enabled
  - Go to Cloudflare Dashboard
  - Account ID -- found in the zone dashboard sidebar
  - API Token -- create one with Workers AI:Edit, Workers AI:Read permissions
git clone https://github.com/pitzcarraldo/codex-workers-ai-proxy.git
cd codex-workers-ai-proxy
cp .env.example .env

Edit .env with your credentials:
CLOUDFLARE_ACCOUNT_ID=<your-account-id>
CLOUDFLARE_API_TOKEN=<your-api-token>

Start the server:
npm start # Node.js
npm run start:bun # Bun

Default address: http://127.0.0.1:4319
Add both functions to ~/.bashrc or ~/.zshrc:
codexcf_start() {
  local proxy_dir="${1:-$(pwd)}"
  local proxy_port="${CODEXCF_PROXY_PORT:-4319}"
  local proxy_log="${CODEXCF_PROXY_LOG:-$HOME/.codex-workers-ai-proxy.log}"
  local proxy_pid_file="${CODEXCF_PROXY_PID_FILE:-$HOME/.codex-workers-ai-proxy.pid}"
  if command -v lsof >/dev/null 2>&1 && lsof -nP -iTCP:"$proxy_port" -sTCP:LISTEN >/dev/null 2>&1; then
    echo "Proxy already running on port $proxy_port"
    return 0
  fi
  (
    cd "$proxy_dir" || exit 1
    nohup npm start >> "$proxy_log" 2>&1 &
    echo $! > "$proxy_pid_file"
  )
  echo "Proxy started. Log: $proxy_log"
}
codexcf() {
  local proxy_port="${CODEXCF_PROXY_PORT:-4319}"
  local default_model="${CODEXCF_MODEL:-@cf/moonshotai/kimi-k2.5}"
  local proxy_dir="${CODEXCF_PROXY_DIR:-$HOME/codex-workers-ai-proxy}"
  local catalog_path="${CODEXCF_MODEL_CATALOG_PATH:-${TMPDIR:-/tmp}/codexcf-model-catalog.json}"
  codexcf_start "$proxy_dir"
  curl -fsS "http://127.0.0.1:${proxy_port}/codex/model-catalog.json" -o "$catalog_path" 2>/dev/null
  codex \
    -m "$default_model" \
    -c 'model_provider="cloudflare_proxy"' \
    -c "model_catalog_json=\"$catalog_path\"" \
    -c 'model_providers.cloudflare_proxy.name="Cloudflare Proxy"' \
    -c "model_providers.cloudflare_proxy.base_url=\"http://127.0.0.1:${proxy_port}/v1\"" \
    -c 'model_providers.cloudflare_proxy.wire_api="responses"' \
    -c 'model_providers.cloudflare_proxy.supports_websockets=false' \
    "$@"
}

Then:
source ~/.zshrc
codexcf "Hello, how are you?"
codexcf -m @cf/openai/gpt-oss-120b "Explain quantum computing"

Add to ~/.codex/config.toml:
[model_providers.cloudflare_proxy]
name = "Cloudflare Workers AI Proxy"
base_url = "http://127.0.0.1:4319/v1"
wire_api = "responses"
supports_websockets = false
[profiles.kimi]
model_provider = "cloudflare_proxy"
model = "@cf/moonshotai/kimi-k2.5"
model_catalog_json = "/tmp/codex-model-catalog.json"

Prepare and run:
curl -fsS http://127.0.0.1:4319/codex/model-catalog.json -o /tmp/codex-model-catalog.json
codex -p kimi

curl -fsS http://127.0.0.1:4319/codex/model-catalog.json -o /tmp/codex-model-catalog.json
codex \
-m @cf/moonshotai/kimi-k2.5 \
-c 'model_provider="cloudflare_proxy"' \
-c 'model_catalog_json="/tmp/codex-model-catalog.json"' \
-c 'model_providers.cloudflare_proxy.name="Cloudflare Proxy"' \
-c 'model_providers.cloudflare_proxy.base_url="http://127.0.0.1:4319/v1"' \
-c 'model_providers.cloudflare_proxy.wire_api="responses"' \
-c 'model_providers.cloudflare_proxy.supports_websockets=false'

Cloudflare Workers AI text generation models ranked by LiveCodeBench performance (as of April 2026):
| Model | LiveCodeBench | AA Index | Context | Comparable Claude | Comparable GPT | Cost (Sonnet 4.5=100%) |
|---|---|---|---|---|---|---|
| kimi-k2.5 | ~85% | 47 | 256k | Claude Sonnet 4.5 | GPT-5.2 | 20% |
| gemma-4-26b-a4b-it | ~77% | ~33 | 256k | Claude Sonnet 4 | GPT-5 mini | 2.5% |
| gpt-oss-120b | ~70% | ~37 | 131k | Claude Sonnet 3.7 | o4-mini | 7.5% |
| glm-4.7-flash | ~65% | 30 | 200k | Claude 3.5 Sonnet | GPT-4.1 | 2.5% |
cd /path/to/codex-workers-ai-proxy
nohup npm start >> ~/.codex-workers-ai-proxy.log 2>&1 &
echo $! > ~/.codex-workers-ai-proxy.pid

# Using PID file
kill "$(cat ~/.codex-workers-ai-proxy.pid)" 2>/dev/null && rm ~/.codex-workers-ai-proxy.pid
# Or kill by port
kill "$(lsof -t -i:4319)" 2>/dev/null

tail -f ~/.codex-workers-ai-proxy.log # Real-time
tail -n 100 ~/.codex-workers-ai-proxy.log # Recent lines
grep "ERROR" ~/.codex-workers-ai-proxy.log # Search errors

Tip: If using the Shell wrapper, the daemon management is handled automatically by codexcf_start.
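
Because the daemon is launched in the background, the proxy may not be listening yet when the next command (for example the model catalog download) runs. The following is a minimal sketch of a readiness poll, not part of the proxy itself: `waitUntilReady` and its `probe` parameter are hypothetical names, and the retry count and delay are arbitrary defaults.

```typescript
// Hypothetical helper: polls `probe` until it reports ready or attempts run out.
// `probe` is supplied by the caller; it should resolve to true once the proxy
// answers on /health (treating a 2xx status as "ready" is an assumption).
async function waitUntilReady(
  probe: () => Promise<boolean>,
  attempts = 20,
  delayMs = 250,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    // A rejected probe (e.g. connection refused) counts as "not ready yet".
    const ready = await probe().catch(() => false);
    if (ready) return true;
    // Sleep between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

On a runtime with a global `fetch`, a probe could be written as `() => fetch("http://127.0.0.1:4319/health").then((r) => r.ok)`. Injecting the probe keeps the retry loop independent of the HTTP client.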
| Method | Path | Description |
|---|---|---|
| POST | /v1/responses | Create a response (Responses API) |
| GET | /v1/responses/:id | Retrieve a response |
| POST | /v1/chat/completions | Chat completion (OpenAI compat) |
| GET | /v1/models | List available models |
| GET | /codex/model-catalog.json | Model catalog for Codex CLI |
| GET | /health | Health check |
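
As a rough illustration of the first row, the sketch below builds a request for POST /v1/responses. `buildResponsesRequest` is a hypothetical helper, and the body assumes the proxy accepts the `model` and `input` fields of the OpenAI Responses API request shape; other fields the proxy may support are not shown.

```typescript
// Hypothetical description of a POST to the proxy's /v1/responses endpoint.
interface ResponsesRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildResponsesRequest(
  baseUrl: string,
  model: string,
  input: string,
): ResponsesRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/responses`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Assumption: `model` and `input` mirror the OpenAI Responses API body.
    body: JSON.stringify({ model, input }),
  };
}

const exampleRequest = buildResponsesRequest(
  "http://127.0.0.1:4319",
  "@cf/moonshotai/kimi-k2.5",
  "Hello!",
);
console.log(exampleRequest.url);
```

The returned object can be passed to any HTTP client, for example `fetch(exampleRequest.url, exampleRequest)` on a runtime that provides a global `fetch`.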
BASE_URL="http://127.0.0.1:4319"
# Health check
curl -s $BASE_URL/health
# List models
curl -s $BASE_URL/v1/models | jq '.data[].id'
# Chat completion
curl -X POST $BASE_URL/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "@cf/moonshotai/kimi-k2.5",
"messages": [{"role": "user", "content": "Hello!"}]
}' | jq .

The proxy forwards a stable x-session-affinity value derived from (in priority order):
- Incoming x-session-affinity header
- Incoming session_id
- prompt_cache_key
- x-client-request-id
- previous_response_id
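
The priority order above amounts to a first-match lookup. The sketch below is an illustration only, not the proxy's actual code: `resolveSessionAffinity` and the `AffinitySource` shape are hypothetical, and it assumes `session_id` and `x-client-request-id` arrive as headers while `prompt_cache_key` and `previous_response_id` arrive in the request body. It also returns the raw value, whereas the proxy may transform it further.

```typescript
// Hypothetical request shape holding the fields named in the list above.
interface AffinitySource {
  headers: Record<string, string | undefined>;
  body: { prompt_cache_key?: string; previous_response_id?: string };
}

// Returns the first non-empty candidate in priority order, or undefined
// when the request carries none of them.
function resolveSessionAffinity(src: AffinitySource): string | undefined {
  const candidates: (string | undefined)[] = [
    src.headers["x-session-affinity"],
    src.headers["session_id"], // assumption: sent as a header
    src.body.prompt_cache_key,
    src.headers["x-client-request-id"],
    src.body.previous_response_id,
  ];
  return candidates.find((v) => typeof v === "string" && v.trim() !== "");
}
```

Keeping the value stable across the turns of one conversation is what lets consecutive requests share a cached prompt prefix, which is why the explicit header takes priority over values inferred from the request.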
| Symptom | Solution |
|---|---|
| EADDRINUSE | Port 4319 in use. lsof -i:4319 to check |
| 401 Unauthorized | Verify CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN |
| 403 Forbidden | Token needs Workers AI:Read permission |
| Empty model catalog | Enable Workers AI in Cloudflare Dashboard |
| Connection refused | Start proxy: npm start |
npm start # Foreground with full output
node --inspect src/index.ts # With Node.js debugger

Attach debugger via Chrome DevTools (chrome://inspect) or VS Code "Attach to Process".
# macOS
lsof -nP -iTCP:4319 -sTCP:LISTEN
# Linux
ss -tlnp | grep 4319
# Quick health check
curl -s http://127.0.0.1:4319/health && echo "OK" || echo "Not responding"

npm run check # Type checking
npm test # Run tests
npm run dev # Dev mode (Node)
npm run dev:bun # Dev mode (Bun)

- Only exposes Workers AI text generation models.
- web_search tools are only translated when external_web_access=true.
- Responses API streaming is translated from Workers AI chat-completions SSE in real time.
- Kimi thinking turns keep reasoning_content in proxy-side history for the next turn and stream live reasoning deltas to Codex when available.
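
To make the streaming translation concrete, here is a simplified sketch of mapping one chat-completions SSE line to a Responses API style event. It is not the proxy's implementation: `translateSseLine` and `ResponsesEvent` are hypothetical, real Responses API events carry additional fields (item ids, indices, sequence numbers, and a full response object on completion) that are omitted, and reasoning deltas are not handled.

```typescript
// Simplified, hypothetical event shape; real events carry more fields.
interface ResponsesEvent {
  type: "response.output_text.delta" | "response.completed";
  delta?: string;
}

// Minimal view of a chat-completions streaming chunk.
type ChatChunk = { choices?: { delta?: { content?: string } }[] } | null;

// Translates one SSE line, or returns null when there is nothing to forward.
function translateSseLine(line: string): ResponsesEvent | null {
  // Ignore comments, blank keep-alives, and non-data fields.
  if (!line.startsWith("data:")) return null;
  const payload = line.slice("data:".length).trim();
  // Chat-completions streams end with a literal [DONE] sentinel.
  if (payload === "[DONE]") return { type: "response.completed" };
  let chunk: ChatChunk;
  try {
    chunk = JSON.parse(payload);
  } catch {
    return null; // malformed or partial JSON is skipped in this sketch
  }
  const text = chunk?.choices?.[0]?.delta?.content;
  return text ? { type: "response.output_text.delta", delta: text } : null;
}
```

Translating line by line, rather than buffering the whole upstream response, is what would allow deltas to reach Codex as they arrive.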