RoodyCode/rag

RAG Knowledge Base

A private document RAG (Retrieval-Augmented Generation) system that ingests PDFs and exposes a search tool via an MCP server. Retrieval combines vector search (pgvector) and BM25 keyword search with cross-encoder reranking, and answers are generated by an AWS Bedrock LLM.
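The relative_score fusion step can be illustrated with a small standalone sketch: each retriever's scores are min-max normalized onto a common 0–1 scale, then combined with a weight. This mirrors the general technique behind QueryFusionRetriever's relative_score mode; the exact LlamaIndex internals may differ, and the doc ids and scores below are made up for illustration.

```python
def relative_score_fusion(vector_hits, bm25_hits, vector_weight=0.5):
    """Min-max normalize each retriever's scores, then combine them.

    vector_hits / bm25_hits: dicts mapping doc id -> raw retriever score.
    Returns doc ids sorted by fused score, highest first.
    """
    def normalize(hits):
        if not hits:
            return {}
        lo, hi = min(hits.values()), max(hits.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
        return {doc: (s - lo) / span for doc, s in hits.items()}

    v, b = normalize(vector_hits), normalize(bm25_hits)
    fused = {}
    for doc in set(v) | set(b):
        fused[doc] = vector_weight * v.get(doc, 0.0) + (1 - vector_weight) * b.get(doc, 0.0)
    return sorted(fused, key=fused.get, reverse=True)

ranked = relative_score_fusion(
    {"doc_a": 0.82, "doc_b": 0.79, "doc_c": 0.40},  # e.g. pgvector cosine similarities
    {"doc_b": 11.4, "doc_c": 9.1},                  # e.g. BM25 scores (unbounded scale)
)
# → ['doc_b', 'doc_a', 'doc_c']
```

Normalizing per retriever is what lets cosine similarities (roughly 0–1) and BM25 scores (unbounded) be mixed on equal footing.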

Architecture

flowchart LR
    subgraph Ingestion
        direction TB
        A[📄 data/ PDFs] --> B[Docling<br/>PDF Parser]
        B --> C[HybridChunker<br/>BAAI/bge-m3 tokenizer]
        C --> D[HuggingFace Embeddings<br/>BAAI/bge-m3 · 1024-dim]
        D --> E[(pgvector<br/>PostgreSQL)]
        C --> F[(Redis<br/>BM25 Docstore)]
    end

    subgraph Query["Query  —  mcp_server.py"]
        direction TB
        G[search_knowledge<br/>tool call] --> H[Vector Retriever<br/>pgvector]
        G --> I[BM25 Retriever<br/>Redis]
        H & I --> J[QueryFusionRetriever<br/>relative_score fusion]
        J --> K[Cross-encoder Reranker<br/>BAAI/bge-reranker-large]
        K --> L[BedrockConverse LLM]
        L --> M[Answer + Sources]
    end

    E --> H
    F --> I
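The ingestion side of the diagram can be sketched end to end with in-memory stand-ins. The chunker, embedder, and stores below are toy substitutes for Docling, the 1024-dim bge-m3 embedder, pgvector, and the Redis docstore; the function names are illustrative and not the project's actual API.

```python
def chunk(text, size=40):
    """Toy stand-in for Docling + HybridChunker: fixed-size character windows."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk_text, dim=8):
    """Toy stand-in for the bge-m3 embedder: hash-derived pseudo-vector."""
    return [((hash(chunk_text) >> (4 * i)) & 0xF) / 15.0 for i in range(dim)]

vector_store = {}   # stand-in for pgvector: chunk id -> embedding
docstore = {}       # stand-in for the Redis BM25 docstore: chunk id -> raw text

def ingest(doc_id, text):
    """Each chunk is written to BOTH stores, so the vector and BM25
    retrievers later query the same underlying chunks."""
    for n, piece in enumerate(chunk(text)):
        cid = f"{doc_id}#{n}"
        vector_store[cid] = embed(piece)   # served to the vector retriever
        docstore[cid] = piece              # served to the BM25 retriever

ingest("report.pdf", "Hybrid retrieval combines dense vectors with sparse keyword search.")
```

The key property shown is the dual write: one chunking pass feeds both retrieval indexes, which keeps the two result sets alignable at fusion time.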

Prerequisites

  • Python 3.11+
  • uv
  • Docker & Docker Compose
  • AWS credentials with Bedrock access

Build and Run

Configuration

ingestion/config.py is the source of truth for supported environment variables and defaults. Copy .env.example to .env and update credentials/endpoints for your environment.

DATABASE_URL=postgresql://chat-app:admin@localhost:5432/chat_app
BEDROCK_API_KEY=<your-aws-bearer-token>
AWS_REGION=eu-central-1
DATA_DIR=./data
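These variables map onto settings roughly as follows. This is a stdlib-only sketch with assumed field names; the real ingestion/config.py uses Pydantic settings and remains the source of truth for the schema and defaults.

```python
import os

def load_settings(env=os.environ):
    """Minimal stand-in for ingestion/config.py: read the variables above,
    falling back to the documented defaults. Field names are illustrative."""
    return {
        "database_url": env.get(
            "DATABASE_URL", "postgresql://chat-app:admin@localhost:5432/chat_app"
        ),
        "bedrock_api_key": env.get("BEDROCK_API_KEY"),  # no safe default: must be set
        "aws_region": env.get("AWS_REGION", "eu-central-1"),
        "data_dir": env.get("DATA_DIR", "./data"),
    }

settings = load_settings({"BEDROCK_API_KEY": "token-123"})
```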

Local Build and Run

  1. Install Python dependencies:
uv sync
  2. Start backing services:
docker compose up -d pgvector redis
  3. Enqueue ingestion jobs:
uv run python ingest.py
  4. Start one or more workers:
uv run python worker.py
  5. Query locally (optional):
uv run python ask.py "your question"
  6. Run MCP server:
uv run python mcp_server.py
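Steps 3 and 4 follow a producer/worker split: ingest.py enqueues one job per document and each worker.py process pops jobs independently, which is why workers can be scaled horizontally. A minimal in-memory sketch of that split, with collections.deque standing in for the Redis/RQ queue and illustrative function names:

```python
from collections import deque

job_queue = deque()  # stand-in for the Redis-backed RQ queue

def enqueue_pdfs(paths):
    """What ingest.py does conceptually: one queued job per document."""
    for path in paths:
        job_queue.append({"task": "ingest_pdf", "path": path})

def run_worker(processed):
    """What each worker.py process does: pop jobs until the queue drains."""
    while job_queue:
        job = job_queue.popleft()
        processed.append(job["path"])  # the real worker runs the ingestion pipeline here

enqueue_pdfs(["data/a.pdf", "data/b.pdf", "data/c.pdf"])
done = []
run_worker(done)  # several worker processes could drain the same queue concurrently
```

Because the queue, not the producer, assigns work, adding workers never duplicates a job; each document is ingested exactly once.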

The server starts on http://localhost:8000 and exposes:

| Tool | Description |
| --- | --- |
| `search_knowledge` | Searches the knowledge base and returns an answer with source file citations |
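The "answer with source file citations" contract can be pictured as a simple payload. This is a hypothetical rendering for illustration only; the actual output format is defined in mcp_server.py.

```python
def format_answer(answer, source_files):
    """Hypothetical shape of a search_knowledge result: the LLM answer
    followed by a deduplicated, sorted list of source citations."""
    sources = sorted(set(source_files))
    return answer + "\n\nSources:\n" + "\n".join(f"- {s}" for s in sources)

text = format_answer(
    "Hybrid retrieval fuses vector and BM25 results before reranking.",
    ["handbook.pdf", "notes.pdf", "handbook.pdf"],  # duplicates collapse to one citation
)
```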

Docker Build and Run

Build images:

docker compose build

Run full stack:

docker compose up -d

Run only ingestion infrastructure + workers:

docker compose up -d pgvector redis worker

Scale worker count:

docker compose up -d --scale worker=3 worker

Run ingestion from inside the worker container (optional):

docker compose exec worker bash
uv run python ingest.py

Maintenance

  • Add dependencies with uv add <package> and commit both pyproject.toml and uv.lock.
  • Rebuild images after dependency changes with docker compose build.
  • When adding or renaming settings, update both ingestion/config.py and .env.example.

Project Structure

.
├── data/                  # PDF documents to ingest
├── ingestion/
│   ├── config.py          # Pydantic settings (loaded from .env)
│   ├── pipeline.py        # Docling parsing, embedding, pgvector + Redis ingestion
│   ├── queue.py           # Redis/RQ enqueueing for ingestion jobs
│   └── tasks.py           # Worker task wrappers around ingestion functions
├── query/
│   └── engine.py          # Hybrid retriever + reranker + Bedrock LLM query engine
├── ingest.py              # Ingestion queue producer entry point
├── worker.py              # RQ worker entry point
├── ask.py                 # Local interactive query CLI
├── mcp_server.py          # FastMCP server exposing search_knowledge tool
├── Dockerfile
└── docker-compose.yml

About

A modular, self-hosted RAG pipeline for building a private, searchable personal knowledge base from PDFs and structured documents.
