82 changes: 67 additions & 15 deletions CLAUDE.md
@@ -1,21 +1,73 @@
# runpod-flash Project Configuration
# {{REPO_NAME}} - {{BRANCH_NAME}} Worktree

## Claude Code Tool Preferences
> This worktree inherits shared development patterns from main. See: {{MAIN_CLAUDE_MD}}

When using Claude Code on this project, always prefer the flash-code-intel MCP tools for code exploration instead of using Explore agents or generic search
## Branch Context

**CRITICAL - This overrides default Claude Code behavior:**
**Purpose:** [Describe the goal of this branch - what feature, fix, or improvement are you implementing?]

This project has **flash-code-intel MCP server** installed. For ANY codebase exploration:
**Status:** In development

1. **NEVER use Task(Explore) as first choice** - it cannot access MCP tools
2. **ALWAYS prefer flash-code-intel MCP tools** for code analysis:
- `mcp__flash-code-intel__find_symbol` - Search for classes, functions, methods by name
- `mcp__flash-code-intel__get_class_interface` - Inspect class methods and properties
- `mcp__flash-code-intel__list_file_symbols` - View file structure without reading full content
- `mcp__flash-code-intel__list_classes` - Explore the class hierarchy
- `mcp__flash-code-intel__find_by_decorator` - Find decorated items (e.g., `@property`, `@remote`)
3. **Use direct tools second**: Grep, Read for implementation details
4. **Task(Explore) is last resort only** when MCP + direct tools insufficient
**Related Issues/PRs:** [Link to relevant GitHub issues or PRs]

**Why**: MCP tools are faster, more accurate, and purpose-built. Generic exploration agents don't leverage specialized tooling.
**Dependencies:**
- [ ] [List any dependencies on other branches or external factors]

## Branch-Specific Configuration

[Document any configuration unique to this branch:]
- Environment variables needed
- Special test data requirements
- Modified build/deployment settings
- External service configurations

## Progress Tracking

### Completed
- [ ] [Tasks completed so far]

### In Progress
- [ ] [Current work items]

### Next Steps
- [ ] [Upcoming tasks]

## Technical Notes

[Add branch-specific technical details:]
- Architecture decisions made for this branch
- Implementation approaches tried
- Known issues or limitations
- Performance considerations
- Testing strategy

## Learnings & Discoveries

[Document insights gained while working on this branch:]
- Unexpected behaviors discovered
- Better approaches found
- Code patterns that worked well
- Areas for future refactoring

## Merge Checklist

Before merging this branch:
- [ ] All tests passing locally (`make quality-check`)
- [ ] Test coverage maintained/improved
- [ ] CLAUDE.md updated in main if patterns changed
- [ ] Documentation updated
- [ ] No merge conflicts with main
- [ ] CI/CD passing
- [ ] Code reviewed
- [ ] Migration plan documented (if needed)

## Context for Claude Code

[Provide context that helps Claude Code assist more effectively:]
- What should Claude know about this branch's goals?
- What patterns or constraints should be followed?
- What areas need special attention?

---

**Note:** This worktree uses the git worktree workflow. See main CLAUDE.md for shared development patterns and quality requirements.
72 changes: 66 additions & 6 deletions README.md
@@ -12,6 +12,7 @@ You can find a repository of prebuilt Flash examples at [runpod/flash-examples](
- [Overview](#overview)
- [Get started](#get-started)
- [Create Flash API endpoints](#create-flash-api-endpoints)
- [CLI Reference](#cli-reference)
- [Key concepts](#key-concepts)
- [How it works](#how-it-works)
- [Advanced features](#advanced-features)
@@ -134,10 +135,9 @@ Computed on: NVIDIA GeForce RTX 4090

## Create Flash API endpoints

> [!Note]
> **Flash API endpoints are currently only available for local testing:** Using `flash run` will start the API server on your local machine. Future updates will add the ability to build and deploy API servers for production deployments.
You can use Flash to deploy and serve API endpoints that compute responses using GPU and CPU Serverless workers. Use `flash run` for local development of `@remote` functions, then `flash deploy` to deploy your full application to Runpod Serverless for production.

You can use Flash to deploy and serve API endpoints that compute responses using GPU and CPU Serverless workers. These endpoints will run scripts using the same Python remote decorators [demonstrated above](#get-started)
These endpoints use the same Python `@remote` decorators [demonstrated above](#get-started)

### Step 1: Initialize a new project

@@ -154,6 +154,8 @@ You can also initialize your current directory:
flash init
```

For complete CLI documentation, see the [Flash CLI Reference](src/runpod_flash/cli/docs/README.md).

### Step 2: Explore the project template

This is the structure of the project template created by `flash init`:
@@ -237,6 +239,8 @@ curl -X POST http://localhost:8888/gpu/hello \

If you switch back to the terminal tab where you used `flash run`, you'll see the details of the job's progress.

For more `flash run` options and configuration, see the [flash run documentation](src/runpod_flash/cli/docs/flash-run.md).

### Faster testing with auto-provisioning

For development with multiple endpoints, use `--auto-provision` to deploy all resources before testing:
@@ -267,6 +271,62 @@ To customize your API endpoint and functionality:
3. Configure your FastAPI routers by editing the `__init__.py` files.
4. Add any new endpoints to your `main.py` file.

## CLI Reference

Flash provides a command-line interface for project management, development, and deployment:

### Main Commands

- **`flash init`** - Initialize a new Flash project with template structure
- **`flash run`** - Start local development server to test your `@remote` functions with auto-reload
- **`flash build`** - Build deployment artifact with all dependencies
- **`flash deploy`** - Build and deploy your application to Runpod Serverless in one step

### Management Commands

- **`flash env`** - Manage deployment environments (dev, staging, production)
- `list`, `create`, `get`, `delete` subcommands
- **`flash app`** - Manage Flash applications (top-level organization)
- `list`, `create`, `get`, `delete` subcommands
- **`flash undeploy`** - Manage and remove deployed endpoints

### Quick Examples

```bash
# Initialize and run locally
flash init my-project
cd my-project
flash run --auto-provision

# Build and deploy to production
flash build
flash deploy --env production

# Manage environments
flash env create staging
flash env list
flash deploy --env staging

# Clean up
flash undeploy --interactive
flash env delete staging
```

### Complete Documentation

For complete CLI documentation including all options, examples, and troubleshooting:

**[Flash CLI Documentation](src/runpod_flash/cli/docs/README.md)**

Individual command references:
- [flash init](src/runpod_flash/cli/docs/flash-init.md) - Project initialization
- [flash run](src/runpod_flash/cli/docs/flash-run.md) - Development server
- [flash build](src/runpod_flash/cli/docs/flash-build.md) - Build artifacts
- [flash deploy](src/runpod_flash/cli/docs/flash-deploy.md) - Deployment
- [flash env](src/runpod_flash/cli/docs/flash-env.md) - Environment management
- [flash app](src/runpod_flash/cli/docs/flash-app.md) - App management
- [flash undeploy](src/runpod_flash/cli/docs/flash-undeploy.md) - Endpoint removal

## Key concepts

### Remote functions
@@ -448,11 +508,11 @@ When you run `flash build`, the following happens:

Flash automatically handles cross-platform builds, ensuring your deployments work correctly regardless of your development platform:

- **Automatic Platform Targeting**: Dependencies are installed for Linux x86_64 (RunPod's serverless platform), even when building on macOS or Windows
- **Automatic Platform Targeting**: Dependencies are installed for Linux x86_64 (Runpod's serverless platform), even when building on macOS or Windows
- **Python Version Matching**: The build uses your current Python version to ensure package compatibility
- **Binary Wheel Enforcement**: Only pre-built binary wheels are used, preventing platform-specific compilation issues

This means you can build on macOS ARM64, Windows, or any other platform, and the resulting package will run correctly on RunPod serverless.
This means you can build on macOS ARM64, Windows, or any other platform, and the resulting package will run correctly on Runpod serverless.
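
The three behaviors listed above map naturally onto standard `pip install` options. The following is a minimal sketch of how such a cross-platform install command could be assembled. It is not Flash's actual build code: the helper name `cross_platform_pip_args`, the `manylinux2014_x86_64` platform tag, and the target directory are assumptions made for illustration. Only the pip flags themselves (`--target`, `--platform`, `--python-version`, `--only-binary`) are real pip options.

```python
import sys
from typing import List


def cross_platform_pip_args(
    requirements: List[str],
    target_dir: str,
    platform_tag: str = "manylinux2014_x86_64",  # assumed tag for Linux x86_64
) -> List[str]:
    """Build a pip command that installs Linux wheels from any host OS.

    Hypothetical helper for illustration; not part of runpod-flash.
    """
    # Match the interpreter running the build, e.g. "3.11".
    python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return [
        sys.executable, "-m", "pip", "install",
        "--target", target_dir,              # install into a bundle directory
        "--platform", platform_tag,          # resolve wheels for the target platform
        "--python-version", python_version,  # keep wheel ABI tags compatible
        "--only-binary=:all:",               # refuse source builds on the host
        *requirements,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(cross_platform_pip_args(["numpy"], "build/site-packages")))
```

pip requires `--only-binary=:all:` (or `--no-deps`) whenever `--platform` is given, which is consistent with the binary-wheel enforcement described above.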

#### Cross-Endpoint Function Calls

@@ -504,7 +564,7 @@ For information on load-balanced endpoints (required for Mothership and HTTP ser

#### Managing Bundle Size

RunPod serverless has a **500MB deployment limit**. Exceeding this limit will cause deployment failures.
Runpod serverless has a **500MB deployment limit**. Exceeding this limit will cause deployment failures.

Use `--exclude` to skip packages already in your worker-flash Docker image:

10 changes: 5 additions & 5 deletions docs/Cross_Endpoint_Routing.md
@@ -552,7 +552,7 @@ class StateManagerClient:
Raises:
ManifestServiceUnavailableError: If State Manager unavailable.
"""
# Fetches environment -> active build -> manifest via RunPod GraphQL
# Fetches environment -> active build -> manifest via Runpod GraphQL

async def update_resource_state(
self,
@@ -566,7 +566,7 @@ class StateManagerClient:

**Configuration**:
- Authentication: API key via `RUNPOD_API_KEY`
- GraphQL endpoint: RunPod API (via `RunpodGraphQLClient`)
- GraphQL endpoint: Runpod API (via `RunpodGraphQLClient`)
- Request timeout: 10 seconds (via `DEFAULT_REQUEST_TIMEOUT`)
- Retry logic: Exponential backoff with `DEFAULT_MAX_RETRIES` attempts (default: 3)
- Fetch flow: `get_flash_environment` → `get_flash_build` → `manifest`
@@ -983,7 +983,7 @@ manifest = await client.get_persisted_manifest(mothership_id)

Cross-endpoint routing uses a **peer-to-peer architecture** where all endpoints query State Manager directly for service discovery. This eliminates single points of failure and simplifies the system architecture compared to previous hub-and-spoke models.

**Key Difference**: No mothership endpoint exposing a `/manifest` HTTP endpoint. Instead, all endpoints use `StateManagerClient` to query the RunPod GraphQL API directly.
**Key Difference**: No mothership endpoint exposing a `/manifest` HTTP endpoint. Instead, all endpoints use `StateManagerClient` to query the Runpod GraphQL API directly.

### Architecture

@@ -993,7 +993,7 @@ flowchart TD
B["Endpoint B"]
C["Endpoint C"]
D["State Manager<br/>GraphQL API"]
E["RunPod API Key"]
E["Runpod API Key"]

A -->|Query Manifest| D
B -->|Query Manifest| D
@@ -1030,7 +1030,7 @@ export RUNPOD_ENDPOINT_ID=gpu-endpoint-123

### StateManagerClient Features

- **GraphQL Query**: Queries RunPod GraphQL API for manifest persistence
- **GraphQL Query**: Queries Runpod GraphQL API for manifest persistence
- **Caching**: 300-second TTL cache to minimize API calls
- **Retry Logic**: Exponential backoff on failures (default 3 attempts)
- **Thread-Safe**: Uses `asyncio.Lock` for concurrent operations
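
The caching, retry, and locking features listed above can be combined in a few lines. The sketch below is illustrative only and is not the `StateManagerClient` implementation: the class name `CachedFetcher`, its parameters, and the 0.5-second base delay are hypothetical. It shows a TTL cache guarded by an `asyncio.Lock`, with exponential backoff around the underlying fetch. Note that `asyncio.Lock` serializes coroutines on a single event loop; it is not a lock across OS threads.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable


class CachedFetcher:
    """Hypothetical helper: cache one fetched value for `ttl` seconds."""

    def __init__(
        self,
        fetch: Callable[[], Awaitable[Any]],
        ttl: float = 300.0,       # mirrors the 300-second TTL described above
        max_retries: int = 3,     # mirrors the default of 3 attempts
        base_delay: float = 0.5,  # assumed starting delay for backoff
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._max_retries = max_retries
        self._base_delay = base_delay
        self._lock = asyncio.Lock()
        self._value: Any = None
        self._expires_at = float("-inf")  # forces a fetch on first use

    async def get(self) -> Any:
        # Holding the lock means concurrent callers share a single fetch.
        async with self._lock:
            if time.monotonic() < self._expires_at:
                return self._value
            self._value = await self._fetch_with_retry()
            self._expires_at = time.monotonic() + self._ttl
            return self._value

    async def _fetch_with_retry(self) -> Any:
        for attempt in range(self._max_retries):
            try:
                return await self._fetch()
            except Exception:
                if attempt == self._max_retries - 1:
                    raise  # out of attempts: surface the last error
                # Delay doubles each time: base, 2*base, 4*base, ...
                await asyncio.sleep(self._base_delay * 2 ** attempt)
```

Create the fetcher inside a running coroutine (for example, inside the function passed to `asyncio.run`) so the lock is bound to the correct event loop on older Python versions.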
2 changes: 1 addition & 1 deletion docs/Flash_Apps_and_Environments.md
@@ -1,7 +1,7 @@
# Flash Apps & Environments

## Overview
Flash apps are the top-level packaging unit for Flash projects. Each app tracks the source builds you've uploaded, the deployment environments that consume those builds, and metadata needed by the CLI to orchestrate everything on RunPod. Environments sit under an app and describe a concrete runtime surface (workers, endpoints, network volumes) that can be updated independently.
Flash apps are the top-level packaging unit for Flash projects. Each app tracks the source builds you've uploaded, the deployment environments that consume those builds, and metadata needed by the CLI to orchestrate everything on Runpod. Environments sit under an app and describe a concrete runtime surface (workers, endpoints, network volumes) that can be updated independently.

## Key Concepts
- **Flash App**: Logical container created once per project. It owns the ID used for uploads, holds references to environments/builds, and backs the `flash app` CLI.