LLM-powered agent for building and validating multimodal infrastructure using Ray.
RayAgent is an intelligent orchestrator that captures user intent, analyzes sample data, and automatically scaffolds infrastructure for multimodal AI applications. It builds buckets, feature extractors, taxonomies, retrievers, and test queries—entirely agentically, using Ray for distributed execution and job orchestration.
```
┌───────────────────────────┐
│    User Input (CLI/UI)    │
│    - Sample files         │
│    - Example queries      │
│    - Success criteria     │
└─────────────┬─────────────┘
              │
              ▼
┌───────────────────────────┐
│    LLM Planning Agent     │
│  - Chooses next action    │
│  - Builds tool call plans │
└─────────────┬─────────────┘
              │
┌─────────────▼─────────────────────┐
│    Ray Job Submission (runner)    │
│  - Submits entire agent loop      │
│  - Executes tool actions via Ray  │
└─────────────┬─────────────────────┘
              │
┌─────────────┴──────────────────────┐
│     Mixpeek Infrastructure API     │
│  - /buckets       - /collections   │
│  - /extractors    - /retrievers    │
│  - /taxonomies    - /validators    │
└─────────────┬──────────────────────┘
              │
              ▼
┌──────────────────────────────┐
│  Streaming + Logging Module  │
│  - Redis pubsub / stdout     │
│  - Optional WebSocket UI     │
└─────────────┬────────────────┘
              │
              ▼
┌───────────────────────────┐
│    Real-Time Feedback     │
│  - Logs of agent steps    │
│  - Success/failure markers│
│  - Suggestions            │
└───────────────────────────┘
```
- 🔁 Agentic Planning Loop – Powered by an LLM that recursively calls your infrastructure APIs.
- 🧠 Intent-to-Infrastructure – Converts user goals and sample queries into fully configured Mixpeek (or custom) resources.
- 🧰 Supports Multimodal Inputs – Works with video, image, audio, PDFs, and structured metadata.
- 🚀 Ray-Native Execution – Each step (create, test, enrich, evaluate) runs as a task inside a Ray Job (see the sketch after this list).
- 📡 Streaming Logs – All API calls, results, and failures are streamed back to the client in real time.
- ✅ Validation Suite – Asserts search quality, enrichment coverage, latency, and taxonomy performance.
- 🔌 Pluggable Backends – Built for Mixpeek, but easily extendable to other infra APIs.
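To make the Ray-native model concrete, here is a minimal sketch of one step running as a Ray task. The `run_tool` stub and its payload are illustrative, not RayAgent's actual tool layer:

```python
import ray

ray.init()  # or connect to an existing cluster with ray.init(address="auto")

@ray.remote
def run_tool(tool_name: str, payload: dict) -> dict:
    # Each agent action (create, test, enrich, evaluate) executes as an
    # isolated Ray task and returns its result to the planner.
    return {"tool": tool_name, "status": "ok", "payload": payload}

# Submit one step and block on its result.
ref = run_tool.remote("create_bucket", {"name": "video-inputs"})
print(ray.get(ref))
```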
Typical use cases:
- Auto-generate infrastructure for RAG or video understanding pipelines
- Test retriever quality on real user queries
- Build taxonomies from scratch based on uploaded data
- Evaluate enrichment coverage and clustering utility
- Serve as an agent sandbox template for infrastructure code execution
```
rayagent/
├── agent/          # LLM planner + tool definitions
│   ├── planner.py
│   └── tools.py
├── executor/       # Ray job logic + task execution
│   ├── runner.py
│   └── logger.py
├── validators/     # Infra success criteria tests
│   ├── recall.py
│   └── enrichment.py
├── ui/             # Optional WebSocket or Streamlit interface
├── main.py         # CLI entrypoint
└── config.yaml     # Infra API config + prompt templates
```
```bash
git clone https://github.com/your-org/RayAgent.git
cd RayAgent
pip install -r requirements.txt
```

Edit `config.yaml` to point to your infrastructure APIs (e.g. Mixpeek SDK or local dev server):
```yaml
mixpeek:
  api_base: "http://localhost:8000"
  api_key: "YOUR_KEY"
llm:
  provider: "openai"
  model: "gpt-4o"
```

Then run the agent from the CLI:

```bash
python main.py init --files path/to/data --queries path/to/intents.json
```

Or submit it as a Ray Job:
```bash
ray job submit --working-dir . -- python main.py init --files ./sample/ --queries ./intents.json
```
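Equivalently, you can submit the job programmatically with Ray's Jobs API; this sketch assumes a local cluster with the dashboard on the default port:

```python
from ray.job_submission import JobSubmissionClient

# Points at the Ray dashboard; 8265 is the default port on a local cluster.
client = JobSubmissionClient("http://127.0.0.1:8265")

job_id = client.submit_job(
    entrypoint="python main.py init --files ./sample/ --queries ./intents.json",
    runtime_env={"working_dir": "."},
)
print(f"Submitted job: {job_id}")
```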
The agent uses an LLM to:
- Inspect file types and metadata
- Parse example queries
- Plan infrastructure actions (e.g., create bucket, add extractor, validate retriever)
- Recursively execute actions via Ray tasks
- Stream results to the frontend or CLI
All actions are tool calls backed by your API (e.g. `POST /buckets`, `POST /retrievers`, etc.).
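In outline, the planning loop looks roughly like the sketch below; the `Action` shape and the injected `plan_next_action` and `execute_tool` callables are stand-ins for the real planner and tool layer in `agent/`:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Action:
    name: str        # tool to call, e.g. "create_bucket"
    arguments: dict  # tool payload produced by the LLM

def run_agent(
    goal: str,
    plan_next_action: Callable[[str, list], Optional[Action]],  # LLM planner
    execute_tool: Callable[[str, dict], dict],                  # runs as a Ray task
) -> list:
    """Hypothetical sketch of the recursive plan -> act -> observe loop."""
    history: list = []
    while True:
        # Ask the LLM for the next tool call, given the goal and prior results.
        action = plan_next_action(goal, history)
        if action is None:  # planner decides the infrastructure is complete
            break
        result = execute_tool(action.name, action.arguments)
        history.append((action, result))  # observations feed the next planning step
    return history
```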
After setup, the agent runs assertions on:
- Query match rate – % of sample queries returning relevant docs
- Taxonomy coverage – % of content tagged by labels
- Clustering coherence – Number and entropy of clusters
- Latency – Cold/warm search timing
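As an illustration, the query match rate assertion could look like the sketch below; `search_fn` and the query dict shape are assumptions, not the validators' actual API:

```python
from typing import Callable

def query_match_rate(
    queries: list[dict],
    search_fn: Callable[[str], list[dict]],  # the retriever under test
    threshold: float = 0.8,
) -> float:
    """Share of sample queries that return at least one relevant document.

    Each query dict is assumed to hold the query text and the IDs of
    documents considered relevant (an illustrative schema, not the real one).
    """
    hits = 0
    for q in queries:
        results = search_fn(q["text"])
        if any(doc["id"] in q["relevant_ids"] for doc in results):
            hits += 1
    rate = hits / len(queries)
    assert rate >= threshold, f"Match rate {rate:.2%} is below {threshold:.0%}"
    return rate
```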
Results are output as a Markdown or JSON report:
```json
{
  "retriever_success_rate": 0.88,
  "taxonomy_coverage": 92.3,
  "average_latency_ms": 712,
  "suggestions": [
    "Add a fallback text retriever for PDF-only docs",
    "Cluster entropy > 1.5 – consider pruning noisy labels"
  ]
}
```

RayAgent streams execution logs using:
- `stdout` (CLI)
- Redis PubSub or WebSocket (UI integration)
- Optional: Ray Dashboard Task Logs
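For reference, a minimal sketch of the Redis PubSub path using redis-py; the `rayagent:logs` channel name is illustrative:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def publish_log(event: dict) -> None:
    # Producer side: fan one agent step out to any connected client.
    r.publish("rayagent:logs", json.dumps(event))

def follow_logs() -> None:
    # Consumer side (e.g. a CLI tail or the WebSocket bridge).
    sub = r.pubsub()
    sub.subscribe("rayagent:logs")
    for message in sub.listen():
        if message["type"] == "message":
            print(json.loads(message["data"]))
```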
You’ll see output like:
```
[Agent] ✅ Created bucket: video-inputs
[Agent] ✅ Added CLIP extractor
[Agent] ❌ Retriever returned 0 results for: 'Find people walking in snow'
[Agent] ✅ Added reranker
[Validator] ✅ 91.5% of queries returned relevant documents
```
```bash
python main.py init \
  --files ./sample-data \
  --queries ./example-queries.json \
  --goal "Build a retriever that supports reverse video search and tags all PDFs with relevant clusters"
```

You can add your own tools:
@tool(name="CreateCustomIndex")
def create_index(name: str, config: dict) -> dict:
return requests.post(f"{api_base}/indexes", json={...}).json()Then register it with the planner.
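For orientation, here is one minimal way a `@tool` decorator like the one above could be implemented; the actual registration mechanism lives in `agent/tools.py` and may differ:

```python
from typing import Any, Callable

# Hypothetical global registry the planner reads when building tool call plans.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {}

def tool(name: str):
    """Register a function so the LLM planner can surface it as a callable tool."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator
```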
PRs welcome! Please open issues or feature requests before submitting large changes.
License: MIT
Roadmap:
- Add Hugging Face agent mode
- Streamlit-based demo frontend
- Auto-doc feedback loop
- Plugin system for other vector DBs (Qdrant, Pinecone, etc.)
