Secure, cloud-sandboxed Recursive Language Models (RLM) with DSPy and Modal.
fleet-rlm provides a production-ready implementation of Recursive Language Modeling aligned with the DSPy RLM API. It gives your AI agent a secure "computer" in the cloud to read, search, and analyze massive datasets without local resource constraints.
Paper | Contributing | Docs
Architecture:

```mermaid
graph TB
    subgraph entry ["Entry Points"]
        CLI["CLI (Typer)"]
        API["FastAPI<br/>(WS/REST)"]
        TUI["Ink TUI<br/>(stdio bridge)"]
        MCP["MCP Server"]
    end
    subgraph orchestration ["Orchestration Layer"]
        Agent["RLMReActChatAgent<br/>(dspy.Module)"]
        History["Chat History"]
        Memory["Core Memory<br/>(Persona/Human/Scratchpad)"]
        DocCache["Document Cache"]
    end
    subgraph tools ["ReAct Tools"]
        DocTools["load_document<br/>read_file_slice<br/>chunk_by_*"]
        RecursiveTools["rlm_query<br/>llm_query<br/>(recursive delegation)"]
        ExecTools["execute_code<br/>edit_file<br/>search_code"]
    end
    subgraph execution ["Execution Layer"]
        Interpreter["ModalInterpreter<br/>(JSON protocol)"]
        Profiles["Execution Profiles:<br/>ROOT | DELEGATE | MAINTENANCE"]
    end
    subgraph cloud ["Modal Cloud"]
        Sandbox["Sandbox Driver<br/>(Python REPL)"]
        Volume[("Persistent Volume<br/>/data/<br/>• workspaces<br/>• artifacts<br/>• memory<br/>• session state")]
    end
    CLI --> Agent
    API --> Agent
    TUI --> Agent
    MCP --> Agent
    Agent --> History
    Agent --> Memory
    Agent --> DocCache
    Agent --> DocTools
    Agent --> RecursiveTools
    Agent --> ExecTools
    DocTools --> Interpreter
    RecursiveTools --> Interpreter
    ExecTools --> Interpreter
    Interpreter --> Profiles
    Interpreter -->|"stdin/stdout<br/>JSON commands"| Sandbox
    Sandbox -->|"read/write"| Volume
    style entry fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style orchestration fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style tools fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style execution fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    style cloud fill:#fce4ec,stroke:#c2185b,stroke-width:2px
```
Layers:

Entry Points → Orchestration → Tools → Execution → Modal Cloud
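The interpreter-to-sandbox edge in the diagram is a line-delimited JSON exchange over the driver's stdin/stdout. The real message schema is internal to fleet-rlm; the sketch below only illustrates the general shape of such a protocol, and the field names (`op`, `code`, `stdout`) and the `driver.py` path are invented for illustration.

```python
# Hypothetical sketch of a stdin/stdout JSON command protocol, in the style
# the diagram shows between ModalInterpreter and the sandbox driver.
# The field names (op, code, stdout) and driver.py are placeholders, not
# fleet-rlm's actual schema or entry point.
import json
import subprocess

driver = subprocess.Popen(
    ["python", "driver.py"],  # placeholder for the real sandbox driver
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# Send one command as a single JSON line...
driver.stdin.write(json.dumps({"op": "exec", "code": "print(2 + 2)"}) + "\n")
driver.stdin.flush()

# ...and read one JSON line back with the captured result.
reply = json.loads(driver.stdout.readline())
print(reply.get("stdout"))
```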
- **Interactive Agent**: `RLMReActChatAgent` (a `dspy.Module`) combines fast, interactive chat with deep, recursive task execution via `rlm_query`.
- **DSPy Aligned**: Implements the `dspy.RLM`, `dspy.Module`, and `dspy.Tool` interfaces, compatible with DSPy optimizers (`BootstrapFewShot`, `MIPROv2`); see the sketch after this list.
- **Secure Sandbox**: Code runs in isolated Modal containers with persistent storage volumes, execution profiles, and sensitive data redaction.
- **Recursive Delegation**: All delegate tools (`rlm_query`, `analyze_long_document`, `grounded_answer`, etc.) spawn true recursive sub-agents via `spawn_delegate_sub_agent()` with unified depth enforcement.
- **PDF Ingestion**: Native document loading via MarkItDown with a pypdf fallback; OCR guidance for scanned PDFs.
- **Session State**: Per-workspace, per-user session persistence with manifests stored on Modal volumes.
- **MCP Server**: Expose fleet-rlm capabilities as an MCP tool server via `serve-mcp`.
- **Observability**: Real-time streaming of thoughts, tool execution, trajectory normalization, and structured logging.
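Because the agent is a `dspy.Module`, it slots into DSPy programs like any other module. The import path, constructor, and call signature below are assumptions for illustration, not the documented API; only `dspy.configure` and `dspy.LM` are standard DSPy calls.

```python
# Minimal sketch of driving the agent from Python. The import path and the
# question= keyword are assumptions; check the package for real signatures.
import dspy
from fleet_rlm import RLMReActChatAgent  # hypothetical import path

# Configure the underlying LM as for any DSPy program.
dspy.configure(lm=dspy.LM("openai/gemini-3-pro-preview"))

agent = RLMReActChatAgent()  # a dspy.Module, per the feature list above
prediction = agent(question="Summarize the components in docs/architecture.md")
print(prediction)
```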
Install:

```bash
uv pip install fleet-rlm
```

Optional extras for server and MCP support:

```bash
uv pip install fleet-rlm[server]  # FastAPI server + WebSocket
uv pip install fleet-rlm[mcp]     # MCP server
uv pip install fleet-rlm[full]    # All extras
```

Set up your Modal and LLM credentials:

```bash
modal setup
modal volume create rlm-volume-dspy
modal secret create LITELLM DSPY_LM_MODEL=openai/gemini-3-pro-preview DSPY_LLM_API_KEY=sk-...
```
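On the Modal side, a secret created this way attaches to functions by name and surfaces its keys as environment variables. The sketch below is generic Modal usage, not fleet-rlm source; the app and function names are made up.

```python
# Generic Modal sketch (not fleet-rlm source): the LITELLM secret created
# above attaches by name and exposes its keys as environment variables.
import os

import modal

app = modal.App("secret-demo")  # made-up app name

@app.function(secrets=[modal.Secret.from_name("LITELLM")])
def show_model() -> str:
    # DSPY_LM_MODEL was set by `modal secret create` above.
    return os.environ["DSPY_LM_MODEL"]
```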
Interactive Chat (OpenTUI):

```bash
# Requires OpenTUI / Bun
fleet-rlm code-chat --opentui
```

Standalone Interactive Chat (Ink):
```bash
# Prefers Ink UI; falls back to Python UI
fleet

# Force a specific runtime
fleet --ui ink
fleet --ui python
```

One-shot Tasks:
```bash
# Basic question
fleet-rlm run-basic --question "What are the first 12 Fibonacci numbers?"

# Document analysis
fleet-rlm run-architecture --docs-path docs/architecture.md --query "Extract all components"
```

Servers:
```bash
# API server (FastAPI + WebSocket)
uv run fleet-rlm serve-api --port 8000

# MCP server
fleet-rlm serve-mcp --transport stdio
```
`fleet` and `fleet-rlm code-chat` serve different interactive paths:

- `fleet` = standalone bridge chat launcher (Ink preferred, Python fallback)
- `fleet-rlm code-chat` = OpenTUI runtime (OpenTUI/Bun required)
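To consume the MCP server from Python, the official `mcp` SDK can spawn `fleet-rlm serve-mcp` over stdio. A sketch, assuming the CLI is on PATH; the tool names it exposes are not specified here.

```python
# Sketch: connect to `fleet-rlm serve-mcp` over stdio with the official
# `mcp` Python SDK and list the tools it exposes.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(
    command="fleet-rlm", args=["serve-mcp", "--transport", "stdio"]
)

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```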
Development setup:

```bash
# Clone and install
git clone https://github.com/qredence/fleet-rlm.git
cd fleet-rlm
uv sync --extra dev

# With server/MCP support
uv sync --extra dev --extra server --extra mcp

# Build Ink frontend bundle for `fleet --ui ink`
cd tui-ink
npm install
npm run build
npm run test
cd ..

# Copy environment template
cp .env.example .env

# Quality gate
uv run ruff check src tests
uv run ruff format --check src tests
uv run ty check src
uv run pytest -q

# Auto-fix formatting when needed
uv run ruff format src tests
```

Documentation:

- Concepts – Core architecture (Agent, RLM, Sandbox)
- User Flows – Interaction diagrams (Chat, Tools, Delegation)
- Architecture – System components and hierarchy
- Tutorials – Step-by-step lessons
- How-To Guides – Installation, deployment, troubleshooting
- CLI Reference – Full CLI command reference
- HTTP API Reference – Server endpoints and WebSocket protocol
- Source Layout – Package structure guide
We welcome contributions! Please see our Contribution Guide and run the quality gate before submitting:

```bash
uv run ruff check src tests
uv run ruff format --check src tests
uv run ty check src
uv run pytest -q
```

MIT License – see LICENSE.
Based on Recursive Language Modeling research by Alex L. Zhang (MIT CSAIL), Omar Khattab (Stanford), and Tim Kraska (MIT).