- PDF-based document ingestion
- Semantic search using dense vector embeddings
- FAISS-powered similarity retrieval
- Context-grounded answer generation
- LangGraph-based retrieval-generation workflow
- Streamlit frontend for interaction
- Modular backend architecture for easy extension
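
The ingestion and retrieval features above can be illustrated with a minimal, dependency-free sketch. It is not this project's code: a word-window splitter stands in for `RecursiveCharacterTextSplitter`, a bag-of-words count vector stands in for dense Hugging Face embeddings, and a brute-force cosine scan stands in for a FAISS index. The function names (`chunk_words`, `embed`, `cosine`, `top_k`) and the sample text are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into overlapping windows of `size` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop early enough that the last window is not just a repeat of the overlap.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (a sparse stand-in for a dense vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (brute-force scan, not FAISS)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


if __name__ == "__main__":
    text = ("FAISS builds an index for fast vector similarity search. "
            "Streamlit renders the chat interface in the browser.")
    for chunk in top_k("vector similarity index", chunk_words(text), k=1):
        print(chunk)
```

The overlap between neighbouring windows keeps a sentence that straddles a boundary retrievable from at least one chunk. A real deployment would replace the brute-force scan with an index, since scanning every chunk per query grows linearly with corpus size.
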
| Component | Technology |
|---|---|
| Language | Python |
| LLM Orchestration | LangChain |
| Workflow Graph | LangGraph |
| Vector Database | FAISS |
| Embeddings | Hugging Face Embeddings |
| LLM Provider | Hugging Face Inference API |
| PDF Parsing | PyPDFLoader |
| Text Chunking | RecursiveCharacterTextSplitter |
| Frontend | Streamlit |
| Backend API | FastAPI |
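
To make the retrieval-generation workflow and the context-grounded prompt more concrete, here is a plain-Python sketch of a two-node pipeline. It does not use LangGraph or the Hugging Face Inference API: nodes are ordinary functions that return partial state updates, the runner merges those updates in order, and the LLM is an injected callable. All names (`RAGState`, `make_retrieve`, `build_prompt`, `make_generate`, `run_workflow`) and the word-overlap ranking are assumptions made for illustration.

```python
from typing import Callable, TypedDict


class RAGState(TypedDict, total=False):
    question: str
    context: list[str]
    answer: str


def make_retrieve(corpus: list[str], k: int = 2) -> Callable[[RAGState], RAGState]:
    """Build a retrieve node that ranks chunks by word overlap (stand-in for vector search)."""
    def retrieve(state: RAGState) -> RAGState:
        q_words = set(state["question"].lower().split())
        ranked = sorted(corpus, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)
        return {"context": ranked[:k]}
    return retrieve


def build_prompt(question: str, context: list[str]) -> str:
    """Ground the model by restricting it to the retrieved context."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer using only the context below. If the context is insufficient, say so.\n"
        f"Context:\n{joined}\n"
        f"Question: {question}"
    )


def make_generate(llm: Callable[[str], str]) -> Callable[[RAGState], RAGState]:
    """Build a generate node around any callable that maps a prompt to text."""
    def generate(state: RAGState) -> RAGState:
        return {"answer": llm(build_prompt(state["question"], state["context"]))}
    return generate


def run_workflow(nodes: list[Callable[[RAGState], RAGState]], question: str) -> RAGState:
    """Run nodes in order, merging each partial update into the shared state."""
    state: RAGState = {"question": question}
    for node in nodes:
        state.update(node(state))
    return state
```

Passing the LLM in as a callable keeps the generate step testable with a stub and makes it straightforward to swap providers without touching the retrieval node.
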

Clone the repository:

    git clone <your-repository-url>
    cd <repository-name>

Create a virtual environment:

    python -m venv .venv

Activate it on Windows:

    .venv\Scripts\activate

Activate it on macOS or Linux:

    source .venv/bin/activate

Install the dependencies:

    pip install -r requirements.txt

Create a .env file in the project root:

    HUGGINGFACEHUB_API_TOKEN=your_token_here

Place all PDF files inside:

    research/

Start the backend:

    uvicorn rag_chain:app --reload

Backend runs at:

    http://127.0.0.1:8000

Start the frontend:

    streamlit run app.py

Project structure:

    ├── app.py              # Streamlit frontend
    ├── rag_chain.py        # RAG pipeline and FastAPI backend
    ├── requirements.txt    # Project dependencies
    ├── .env                # Hugging Face API token
    ├── research/           # PDF document corpus
    │   ├── paper1.pdf
    │   ├── paper2.pdf
    │   └── ...
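
Before starting the backend, it can help to confirm that the layout above is in place. The following is a small, hypothetical preflight script (not part of the repository) that checks for `HUGGINGFACEHUB_API_TOKEN` in the environment or in `.env`, and for at least one PDF in `research/`. The helper names `read_env_file` and `preflight` are assumptions, and the `.env` parser only handles simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(root: Path) -> list[str]:
    """Return a list of problems; an empty list means the layout looks ready."""
    problems: list[str] = []
    name = "HUGGINGFACEHUB_API_TOKEN"
    token = os.environ.get(name) or read_env_file(root / ".env").get(name)
    if not token or token == "your_token_here":
        problems.append(f"{name} is missing or still set to the placeholder")
    research = root / "research"
    pdfs = sorted(research.glob("*.pdf")) if research.is_dir() else []
    if not pdfs:
        problems.append("no PDF files found in research/")
    return problems


if __name__ == "__main__":
    for problem in preflight(Path(".")):
        print(f"warning: {problem}")
```

Run it from the project root; it prints one warning per problem and nothing when both checks pass.
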