runpod · deanq · Feb 13, 2026 · Feb 14, 2026 · Feb 14, 2026 · Feb 14, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,21 +1,73 @@
-# runpod-flash Project Configuration
+# {{REPO_NAME}} - {{BRANCH_NAME}} Worktree
 
-## Claude Code Tool Preferences
+> This worktree inherits shared development patterns from main. See: {{MAIN_CLAUDE_MD}}
 
-When using Claude Code on this project, always prefer the flash-code-intel MCP tools for code exploration instead of using Explore agents or generic search
+## Branch Context
 
-**CRITICAL - This overrides default Claude Code behavior:**
+**Purpose:** [Describe the goal of this branch - what feature, fix, or improvement are you implementing?]
 
-This project has **flash-code-intel MCP server** installed. For ANY codebase exploration:
+**Status:** In development
 
-1. **NEVER use Task(Explore) as first choice** - it cannot access MCP tools
-2. **ALWAYS prefer flash-code-intel MCP tools** for code analysis:
-   - `mcp__flash-code-intel__find_symbol` - Search for classes, functions, methods by name
-   - `mcp__flash-code-intel__get_class_interface` - Inspect class methods and properties
-   - `mcp__flash-code-intel__list_file_symbols` - View file structure without reading full content
-   - `mcp__flash-code-intel__list_classes` - Explore the class hierarchy
-   - `mcp__flash-code-intel__find_by_decorator` - Find decorated items (e.g., `@property`, `@remote`)
-3. **Use direct tools second**: Grep, Read for implementation details
-4. **Task(Explore) is last resort only** when MCP + direct tools insufficient
+**Related Issues/PRs:** [Link to relevant GitHub issues or PRs]
 
-**Why**: MCP tools are faster, more accurate, and purpose-built. Generic exploration agents don't leverage specialized tooling.
+**Dependencies:**
+- [ ] [List any dependencies on other branches or external factors]
+
+## Branch-Specific Configuration
+
+[Document any configuration unique to this branch:]
+- Environment variables needed
+- Special test data requirements
+- Modified build/deployment settings
+- External service configurations
+
+## Progress Tracking
+
+### Completed
+- [ ] [Tasks completed so far]
+
+### In Progress
+- [ ] [Current work items]
+
+### Next Steps
+- [ ] [Upcoming tasks]
+
+## Technical Notes
+
+[Add branch-specific technical details:]
+- Architecture decisions made for this branch
+- Implementation approaches tried
+- Known issues or limitations
+- Performance considerations
+- Testing strategy
+
+## Learnings & Discoveries
+
+[Document insights gained while working on this branch:]
+- Unexpected behaviors discovered
+- Better approaches found
+- Code patterns that worked well
+- Areas for future refactoring
+
+## Merge Checklist
+
+Before merging this branch:
+- [ ] All tests passing locally (`make quality-check`)
+- [ ] Test coverage maintained/improved
+- [ ] CLAUDE.md updated in main if patterns changed
+- [ ] Documentation updated
+- [ ] No merge conflicts with main
+- [ ] CI/CD passing
+- [ ] Code reviewed
+- [ ] Migration plan documented (if needed)
+
+## Context for Claude Code
+
+[Provide context that helps Claude Code assist more effectively:]
+- What should Claude know about this branch's goals?
+- What patterns or constraints should be followed?
+- What areas need special attention?
+
+---
+
+**Note:** This worktree uses the git worktree workflow. See main CLAUDE.md for shared development patterns and quality requirements.
diff --git a/README.md b/README.md
@@ -12,6 +12,7 @@ You can find a repository of prebuilt Flash examples at [runpod/flash-examples](
 - [Overview](#overview)
 - [Get started](#get-started)
 - [Create Flash API endpoints](#create-flash-api-endpoints)
+- [CLI Reference](#cli-reference)
 - [Key concepts](#key-concepts)
 - [How it works](#how-it-works)
 - [Advanced features](#advanced-features)
@@ -134,10 +135,9 @@ Computed on: NVIDIA GeForce RTX 4090
 
 ## Create Flash API endpoints
 
-> [!Note]
-> **Flash API endpoints are currently only available for local testing:** Using `flash run` will start the API server on your local machine. Future updates will add the ability to build and deploy API servers for production deployments.
+You can use Flash to deploy and serve API endpoints that compute responses using GPU and CPU Serverless workers. Use `flash run` for local development of `@remote` functions, then `flash deploy` to deploy your full application to Runpod Serverless for production.
 
-You can use Flash to deploy and serve API endpoints that compute responses using GPU and CPU Serverless workers. These endpoints will run scripts using the same Python remote decorators [demonstrated above](#get-started)
+These endpoints use the same Python `@remote` decorators [demonstrated above](#get-started).
 
 ### Step 1: Initialize a new project
 
@@ -154,6 +154,8 @@ You can also initialize your current directory:
 flash init
 ```
 
+For complete CLI documentation, see the [Flash CLI Reference](src/runpod_flash/cli/docs/README.md).
+
 ### Step 2: Explore the project template
 
 This is the structure of the project template created by `flash init`:
@@ -237,6 +239,8 @@ curl -X POST http://localhost:8888/gpu/hello \
 
 If you switch back to the terminal tab where you used `flash run`, you'll see the details of the job's progress.
 
+For more `flash run` options and configuration, see the [flash run documentation](src/runpod_flash/cli/docs/flash-run.md).
+
 ### Faster testing with auto-provisioning
 
 For development with multiple endpoints, use `--auto-provision` to deploy all resources before testing:
@@ -267,6 +271,62 @@ To customize your API endpoint and functionality:
 3. Configure your FastAPI routers by editing the `__init__.py` files.
 4. Add any new endpoints to your `main.py` file.
 
+## CLI Reference
+
+Flash provides a command-line interface for project management, development, and deployment:
+
+### Main Commands
+
+- **`flash init`** - Initialize a new Flash project with the template structure
+- **`flash run`** - Start a local development server with auto-reload to test your `@remote` functions
+- **`flash build`** - Build a deployment artifact with all dependencies
+- **`flash deploy`** - Build and deploy your application to Runpod Serverless in one step
+
+### Management Commands
+
+- **`flash env`** - Manage deployment environments (dev, staging, production)
+  - `list`, `create`, `get`, `delete` subcommands
+- **`flash app`** - Manage Flash applications (top-level organization)
+  - `list`, `create`, `get`, `delete` subcommands
+- **`flash undeploy`** - Manage and remove deployed endpoints
+
+### Quick Examples
+
+```bash
+# Initialize and run locally
+flash init my-project
+cd my-project
+flash run --auto-provision
+
+# Build and deploy to production (`flash deploy` also builds, so the separate `flash build` is optional)
+flash build
+flash deploy --env production
+
+# Manage environments
+flash env create staging
+flash env list
+flash deploy --env staging
+
+# Clean up
+flash undeploy --interactive
+flash env delete staging
+```
+
+### Complete Documentation
+
+For complete CLI documentation including all options, examples, and troubleshooting:
+
+**[Flash CLI Documentation](src/runpod_flash/cli/docs/README.md)**
+
+Individual command references:
+- [flash init](src/runpod_flash/cli/docs/flash-init.md) - Project initialization
+- [flash run](src/runpod_flash/cli/docs/flash-run.md) - Development server
+- [flash build](src/runpod_flash/cli/docs/flash-build.md) - Build artifacts
+- [flash deploy](src/runpod_flash/cli/docs/flash-deploy.md) - Deployment
+- [flash env](src/runpod_flash/cli/docs/flash-env.md) - Environment management
+- [flash app](src/runpod_flash/cli/docs/flash-app.md) - App management
+- [flash undeploy](src/runpod_flash/cli/docs/flash-undeploy.md) - Endpoint removal
+
 ## Key concepts
 
 ### Remote functions
@@ -448,11 +508,11 @@ When you run `flash build`, the following happens:
 
 Flash automatically handles cross-platform builds, ensuring your deployments work correctly regardless of your development platform:
 
-- **Automatic Platform Targeting**: Dependencies are installed for Linux x86_64 (RunPod's serverless platform), even when building on macOS or Windows
+- **Automatic Platform Targeting**: Dependencies are installed for Linux x86_64 (Runpod's serverless platform), even when building on macOS or Windows
 - **Python Version Matching**: The build uses your current Python version to ensure package compatibility
 - **Binary Wheel Enforcement**: Only pre-built binary wheels are used, preventing platform-specific compilation issues
 
-This means you can build on macOS ARM64, Windows, or any other platform, and the resulting package will run correctly on RunPod serverless.
+This means you can build on macOS ARM64, Windows, or any other platform, and the resulting package will run correctly on Runpod serverless.
 
 #### Cross-Endpoint Function Calls
 
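Note on the cross-platform build bullets in this hunk: the diff does not show how Flash performs the install, so the following is only a minimal sketch, assuming pip is the installer. Under that assumption, the three behaviors (Linux x86_64 targeting, Python version matching, binary-wheel enforcement) map onto standard pip options. The function name `cross_platform_pip_args` and the `manylinux2014_x86_64` platform tag are illustrative choices, not taken from the Flash source.

```python
import sys


def cross_platform_pip_args(requirements, target_dir, python_version=None):
    """Build a pip command that installs Linux x86_64 binary wheels into target_dir.

    Hypothetical helper for illustration; Flash's real build step may differ.
    """
    if python_version is None:
        # Match the interpreter that is running the build, e.g. "3.11".
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return [
        sys.executable, "-m", "pip", "install",
        "--target", target_dir,                # required by pip when --platform is used
        "--platform", "manylinux2014_x86_64",  # assumed tag for Linux x86_64
        "--python-version", python_version,
        "--only-binary=:all:",                 # refuse source builds, wheels only
        *requirements,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(cross_platform_pip_args(["numpy"], "build/deps")))
```

With `--only-binary=:all:`, pip fails fast when a dependency has no compatible wheel rather than trying to compile it on the host, which matches the "Binary Wheel Enforcement" bullet.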
@@ -504,7 +564,7 @@ For information on load-balanced endpoints (required for Mothership and HTTP ser
 
 #### Managing Bundle Size
 
-RunPod serverless has a **500MB deployment limit**. Exceeding this limit will cause deployment failures.
+Runpod serverless has a **500MB deployment limit**. Exceeding this limit will cause deployment failures.
 
 Use `--exclude` to skip packages already in your worker-flash Docker image:
 

diff --git a/docs/Cross_Endpoint_Routing.md b/docs/Cross_Endpoint_Routing.md
@@ -552,7 +552,7 @@ class StateManagerClient:
         Raises:
             ManifestServiceUnavailableError: If State Manager unavailable.
         """
-        # Fetches environment -> active build -> manifest via RunPod GraphQL
+        # Fetches environment -> active build -> manifest via Runpod GraphQL
 
     async def update_resource_state(
         self,
@@ -566,7 +566,7 @@ class StateManagerClient:
 
 **Configuration**:
 - Authentication: API key via `RUNPOD_API_KEY`
-- GraphQL endpoint: RunPod API (via `RunpodGraphQLClient`)
+- GraphQL endpoint: Runpod API (via `RunpodGraphQLClient`)
 - Request timeout: 10 seconds (via `DEFAULT_REQUEST_TIMEOUT`)
 - Retry logic: Exponential backoff with `DEFAULT_MAX_RETRIES` attempts (default: 3)
 - Fetch flow: `get_flash_environment` → `get_flash_build` → `manifest`
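Note on the timeout and retry settings in the configuration list above: the following is a minimal sketch of a per-attempt timeout combined with exponential backoff, assuming an async operation. Only the constant names `DEFAULT_REQUEST_TIMEOUT` and `DEFAULT_MAX_RETRIES` and their values come from the list. The helper `call_with_retries` and the `base_delay` parameter are hypothetical, and the actual `StateManagerClient` logic may differ.

```python
import asyncio

DEFAULT_REQUEST_TIMEOUT = 10.0  # seconds, as stated in the configuration list
DEFAULT_MAX_RETRIES = 3         # attempts, as stated in the configuration list


async def call_with_retries(operation, max_retries=DEFAULT_MAX_RETRIES,
                            timeout=DEFAULT_REQUEST_TIMEOUT, base_delay=0.5):
    """Await operation() with a per-attempt timeout, retrying with exponential backoff.

    Hypothetical helper: base_delay is an assumed value, not from the source.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return await asyncio.wait_for(operation(), timeout=timeout)
        except Exception as exc:  # includes the timeout raised by wait_for
            last_error = exc
            if attempt < max_retries - 1:
                # Delays grow as base_delay * 1, 2, 4, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise last_error
```

A caller would pass a zero-argument coroutine function, for example `await call_with_retries(lambda: client.fetch())`, where `client.fetch` stands in for whatever request is being retried.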
@@ -983,7 +983,7 @@ manifest = await client.get_persisted_manifest(mothership_id)
 
 Cross-endpoint routing uses a **peer-to-peer architecture** where all endpoints query State Manager directly for service discovery. This eliminates single points of failure and simplifies the system architecture compared to previous hub-and-spoke models.
 
-**Key Difference**: No mothership endpoint exposing a `/manifest` HTTP endpoint. Instead, all endpoints use `StateManagerClient` to query the RunPod GraphQL API directly.
+**Key Difference**: No mothership endpoint exposing a `/manifest` HTTP endpoint. Instead, all endpoints use `StateManagerClient` to query the Runpod GraphQL API directly.
 
 ### Architecture
 
@@ -993,7 +993,7 @@ flowchart TD
     B["Endpoint B"]
     C["Endpoint C"]
     D["State Manager<br/>GraphQL API"]
-    E["RunPod API Key"]
+    E["Runpod API Key"]
 
     A -->|Query Manifest| D
     B -->|Query Manifest| D
@@ -1030,7 +1030,7 @@ export RUNPOD_ENDPOINT_ID=gpu-endpoint-123
 
 ### StateManagerClient Features
 
-- **GraphQL Query**: Queries RunPod GraphQL API for manifest persistence
+- **GraphQL Query**: Queries Runpod GraphQL API for manifest persistence
 - **Caching**: 300-second TTL cache to minimize API calls
 - **Retry Logic**: Exponential backoff on failures (default 3 attempts)
 - **Thread-Safe**: Uses `asyncio.Lock` for concurrent operations
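Note on the caching and locking bullets above: the following is a minimal sketch of a 300-second TTL cache guarded by `asyncio.Lock`. The class name `TTLManifestCache`, the `fetch` callable, and the injectable `clock` are hypothetical and not taken from `StateManagerClient`. An `asyncio.Lock` serializes coroutines running on one event loop; Python's documentation marks it as not thread-safe, so it does not protect against access from multiple OS threads.

```python
import asyncio
import time


class TTLManifestCache:
    """Cache one fetched value for ttl_seconds; concurrent callers share one fetch.

    Hypothetical illustration, not the StateManagerClient implementation.
    """

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch          # async callable that returns the manifest
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._lock = asyncio.Lock()
        self._value = None
        self._expires_at = None      # None means nothing is cached yet

    async def get(self):
        # Holding the lock across the fetch means a second caller waits and
        # then reads the fresh value instead of issuing a duplicate request.
        async with self._lock:
            now = self._clock()
            if self._expires_at is None or now >= self._expires_at:
                self._value = await self._fetch()
                self._expires_at = now + self._ttl
            return self._value
```

Holding the lock during the fetch trades some latency for fewer API calls, which is in line with the stated goal of minimizing API calls.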

diff --git a/docs/Flash_Apps_and_Environments.md b/docs/Flash_Apps_and_Environments.md
@@ -1,7 +1,7 @@
 # Flash Apps & Environments
 
 ## Overview
-Flash apps are the top-level packaging unit for Flash projects. Each app tracks the source builds you've uploaded, the deployment environments that consume those builds, and metadata needed by the CLI to orchestrate everything on RunPod. Environments sit under an app and describe a concrete runtime surface (workers, endpoints, network volumes) that can be updated independently.
+Flash apps are the top-level packaging unit for Flash projects. Each app tracks the source builds you've uploaded, the deployment environments that consume those builds, and metadata needed by the CLI to orchestrate everything on Runpod. Environments sit under an app and describe a concrete runtime surface (workers, endpoints, network volumes) that can be updated independently.
 
 ## Key Concepts
 - **Flash App**: Logical container created once per project. It owns the ID used for uploads, holds references to environments/builds, and backs the `flash app` CLI.