RAG System with Single-Node & Distributed Architectures

A production-ready Retrieval-Augmented Generation (RAG) system with support for both single-node (development) and distributed (production) deployments.

🌟 Features

Core RAG Capabilities

- Retrieval-Augmented Generation: Context-aware answers using PDF content indexed in Qdrant
- LangGraph Orchestration: State machine-based conversational flow
- Cross-Encoder Reranking: Improved retrieval relevance
- Redis Caching: 5-minute TTL for query results
- Answer Validation Loop: Automatic relevance checking with retry logic
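
To make the caching and validation features concrete, here is a minimal sketch of how a 5-minute TTL cache and a relevance-check retry loop might fit together. It is not the project's implementation: an in-memory dictionary stands in for Redis, and `TTLCache`, `cache_key`, `answer_with_cache`, `generate`, `is_relevant`, and `MAX_RETRIES` are hypothetical names chosen for illustration. Only the 5-minute TTL comes from the feature list.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 300  # 5-minute TTL, as stated in the feature list
MAX_RETRIES = 2          # assumption: the real retry budget is not documented here


class TTLCache:
    """In-memory stand-in for Redis: entries expire after a fixed TTL."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries behave like a cache miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


def cache_key(query: str) -> str:
    """Normalise case and whitespace so near-identical queries share an entry."""
    normalised = " ".join(query.lower().split())
    return "rag:" + hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def answer_with_cache(
    query: str,
    cache: TTLCache,
    generate: Callable[[str, int], str],
    is_relevant: Callable[[str, str], bool],
) -> Tuple[str, bool]:
    """Return (answer, cache_hit), regenerating until validation passes."""
    key = cache_key(query)
    cached = cache.get(key)
    if cached is not None:
        return cached, True

    answer = ""
    for attempt in range(MAX_RETRIES + 1):
        answer = generate(query, attempt)
        if is_relevant(query, answer):
            break  # validation passed, stop retrying
    # If every attempt fails validation, the last answer is still returned.
    cache.set(key, answer)
    return answer, False
```

The boolean in the return value is the kind of signal a UI could use to show cache hit/miss status.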

Web Interface

- 4-Client Demo: Concurrent WebSocket-based query processing
- Real-time Monitoring: Live metrics and query logging
- Cache Indicators: Visual cache hit/miss status
- Streaming Responses: Word-by-word response delivery
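
Word-by-word streaming can be sketched with an async generator. This is an illustration only: `stream_words`, `deliver`, and `collect` are hypothetical helpers, and `send` stands in for whatever coroutine the server uses to push text to a client (for example, a WebSocket's send method).

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List


async def stream_words(answer: str, delay: float = 0.0) -> AsyncIterator[str]:
    """Yield the answer one word at a time, with a space after all but the last."""
    words = answer.split()
    for index, word in enumerate(words):
        suffix = "" if index == len(words) - 1 else " "
        yield word + suffix
        if delay:
            await asyncio.sleep(delay)  # optional pacing between chunks


async def deliver(answer: str, send: Callable[[str], Awaitable[None]]) -> None:
    """Push each chunk through `send` as soon as it is produced."""
    async for chunk in stream_words(answer):
        await send(chunk)


def collect(answer: str) -> List[str]:
    """Run `deliver` against an in-memory sink and return the chunks received."""
    received: List[str] = []

    async def sink(chunk: str) -> None:
        received.append(chunk)

    asyncio.run(deliver(answer, sink))
    return received
```

Concatenating the chunks on the client side reproduces the answer with whitespace normalised to single spaces.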

Architecture Modes

- Single-Node Mode: Lightweight setup for local development
- Distributed Mode: Multi-node clusters for production scaling
  - 3-node Qdrant cluster with sharding
  - 3-node MongoDB replica set
  - 6-node Redis cluster
  - 3-node RabbitMQ cluster
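
The contents of qdrant_sharding.py are not shown in this README, so the following is only a generic sketch of hash-based shard routing, not the project's strategy or Qdrant's own placement logic. `NUM_SHARDS`, `shard_for`, and `distribution` are hypothetical names, and one shard per node is an assumption.

```python
import hashlib
from collections import Counter
from typing import Iterable

NUM_SHARDS = 3  # assumption: one shard per Qdrant node


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process and would route the same ID differently between runs.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(doc_ids: Iterable[str], num_shards: int = NUM_SHARDS) -> Counter:
    """Count how many IDs land on each shard, to eyeball the balance."""
    return Counter(shard_for(doc_id, num_shards) for doc_id in doc_ids)
```

Simple modulo routing like this reshuffles most keys when the shard count changes, which is one reason real clusters often use other schemes.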

🚀 Quick Start

Single-Node Mode (Development)

# Start infrastructure
docker-compose up -d

# Install dependencies
pip install -e .

# Index PDF
python embed.py

# Start server
RAG_MODE=single uvicorn src.api.main:app --port 8000

Distributed Mode (Production-like)

# Start distributed cluster
docker-compose -f docker-compose.distributed.yml up -d

# Index to cluster
python embed_distributed.py

# Start server
RAG_MODE=distributed uvicorn src.api.main:app --port 8000

📖 For detailed instructions, see DEMO_GUIDE.md

📋 Requirements

- Python 3.10+
- Docker and Docker Compose
- OpenAI API key in .env:
```
OPENAI_API_KEY=your-key-here
```

🌐 Access Points

| Service | Single Mode | Distributed Mode |
| --- | --- | --- |
| Demo UI | http://localhost:8000 | http://localhost:8000 |
| Qdrant | :6333 | :6333, :6336, :6338 |
| Monitoring | :3000 (Grafana) | :3000 (Grafana) |

📁 Project Structure

Project-RAG/
├── src/
│   ├── api/main.py              # FastAPI server
│   ├── core/
│   │   ├── config.py            # Mode configuration
│   │   ├── engine_factory.py    # Engine factory
│   │   ├── rag_engine.py        # Single-node engine
│   │   └── rag_engine_distributed.py
│   └── ui/                      # Web interface
├── cluster_clients.py           # Distributed clients
├── qdrant_sharding.py           # Sharding strategies
├── embed.py                     # Single-node embedding
├── embed_distributed.py         # Distributed embedding
├── docker-compose.yml           # Single-node setup
├── docker-compose.distributed.yml
└── DEMO_GUIDE.md                # Complete documentation

🎯 Usage

Web Demo

- Start the server (see Quick Start)
- Visit http://localhost:8000
- Use the 4-client interface to test concurrent queries
- Monitor metrics at http://localhost:8000/monitor

Command Line

python main.py

Check Current Mode

curl http://localhost:8000/mode
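
The same check can be made from Python with the standard library. This is a sketch under assumptions: `mode_url` and `fetch_mode` are hypothetical helpers, and because the response format of `/mode` is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

DEFAULT_BASE_URL = "http://localhost:8000"  # address used throughout this README


def mode_url(base_url: str = DEFAULT_BASE_URL) -> str:
    """Build the /mode URL, tolerating a trailing slash on the base URL."""
    return urljoin(base_url.rstrip("/") + "/", "mode")


def fetch_mode(base_url: str = DEFAULT_BASE_URL, timeout: float = 5.0) -> str:
    """Return the raw response body of GET /mode (requires a running server)."""
    with urlopen(mode_url(base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_mode())
```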

🔧 Configuration

Switch modes via the RAG_MODE environment variable:

- RAG_MODE=single: Single-node mode (default)
- RAG_MODE=distributed: Distributed cluster mode
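
As an illustration of how such a switch could be read in code, the sketch below resolves `RAG_MODE` with `single` as the default. It is not the contents of src/core/config.py: `resolve_mode`, `VALID_MODES`, and `DEFAULT_MODE` are hypothetical names, and the case-insensitive matching and the error on unknown values are assumptions.

```python
import os
from typing import Mapping, Optional

VALID_MODES = ("single", "distributed")
DEFAULT_MODE = "single"  # the README lists single-node mode as the default


def resolve_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured mode, falling back to the default when unset."""
    env = os.environ if env is None else env
    mode = env.get("RAG_MODE", DEFAULT_MODE).strip().lower()
    if mode not in VALID_MODES:
        # Assumption: failing fast is preferable to silently picking a mode.
        raise ValueError(
            f"RAG_MODE must be one of {VALID_MODES}, got {mode!r}"
        )
    return mode
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.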

📚 Documentation

- DEMO_GUIDE.md: Complete setup and usage guide
- pyproject.toml: Project dependencies

🤝 Development

- Use single-node mode for rapid development
- Test features locally with the demo UI
- Switch to distributed mode for scaling tests
- Use helper scripts in scripts/ for environment management

Ready to start? See DEMO_GUIDE.md for detailed instructions.
