Harsh Bhanushali (@harshbhanushali26)

About Me

"I've shipped enough broken AI projects to know — the architecture meeting you skip always becomes the bug you can't find."

I'm an Agentic AI Developer from Gujarat, India. I build autonomous AI systems — RAG pipelines, multi-agent graphs, LLM-backed CLIs — and the one thing every project I've shipped has in common is that I planned the structure before writing a single line of code.

That habit didn't come from a course. It came from building things without it and feeling exactly what breaks and why. Now every project starts with a clear execution layer, defined state boundaries, and deliberate tradeoff decisions — before a file even gets created.

I work across the full AI engineering stack: from protocol-level MCP tool servers and LangGraph stateful graphs, to FastAPI backends and Streamlit interfaces that real users can actually interact with.

Currently building Capsule — a personal content manager with a Telegram bot frontend, FastAPI backend, Groq-powered processing, and hybrid SQLite + ChromaDB storage.

📍 Open to AI Engineer roles at AI-first startups — Mumbai, Pune, Bangalore, Hyderabad, or remote.

🏗️ Featured Projects

🔬 Multi-Agent Research Pipeline | 🔗 Repository

Autonomous CLI system executing structured, stateful research workflows across 6 specialized AI agents.

Hierarchical Orchestration: Designed a full supervisor-driven agent hierarchy — Supervisor → Search → Scrape → Summarize → Critique → Synthesize — with explicit state boundaries between each role to prevent context bleed across agents.
Quota-Aware Model Routing: Built dynamic fallback logic using Groq (llama-3.3-70b) as primary and Gemini (gemini-2.0-flash) as fallback tier — agents switch models mid-pipeline without interrupting the workflow.
Human-in-the-Loop Checkpoints: Engineered interactive review gates at critical pipeline stages with a clean Rich terminal UI, keeping a human in control without breaking the execution flow.
Stack: Python, Groq, Gemini, DuckDuckGo, Rich CLI
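
A minimal sketch of how the quota-aware fallback described above could be structured. This is not the repository's code: `QuotaExceeded`, `ModelTier`, `FallbackRouter`, and the two stub functions are hypothetical names, and the stubs stand in for real Groq and Gemini SDK calls.

```python
from dataclasses import dataclass, field
from typing import Callable


class QuotaExceeded(Exception):
    """Raised by a provider wrapper when its rate limit or quota is exhausted."""


@dataclass
class ModelTier:
    name: str                   # model label, e.g. "llama-3.3-70b"
    call: Callable[[str], str]  # hypothetical wrapper around the provider SDK


@dataclass
class FallbackRouter:
    tiers: list[ModelTier]
    used: list[str] = field(default_factory=list)  # which tier answered each call

    def complete(self, prompt: str) -> str:
        # Try tiers in priority order; only a quota error triggers the fallback.
        for tier in self.tiers:
            try:
                result = tier.call(prompt)
            except QuotaExceeded:
                continue
            self.used.append(tier.name)
            return result
        raise RuntimeError("all model tiers are out of quota")


def _primary_stub(prompt: str) -> str:
    # Simulates the primary provider rejecting the request.
    raise QuotaExceeded("simulated quota error from primary")


def _fallback_stub(prompt: str) -> str:
    return f"summary of: {prompt}"


router = FallbackRouter([
    ModelTier("llama-3.3-70b", _primary_stub),
    ModelTier("gemini-2.0-flash", _fallback_stub),
])
answer = router.complete("agentic RAG")
```

Because the agent only sees `router.complete(...)`, a switch between tiers never surfaces as an error in the calling step, which is one way to keep the workflow uninterrupted.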

🧠 hArI — RAG Document Intelligence | 🔗 Repository

Production-deployed web app for grounded, hallucination-resistant document interrogation.

Precision Retrieval: Switched ChromaDB distance metric from L2 to cosine and enforced a strict SCORE_THRESHOLD=0.35 — completely eliminated hallucinated source citations without touching the LLM layer.
Intent-Based Routing: Built a local intent classifier that picks the query path (Conversational → Vector RAG → LLM fallback) before hitting the vector store, reducing unnecessary embedding lookups.
Clean Stream Output: Wrote a custom strip_thinking() post-processor to scrub raw LLM reasoning tokens before they reach the UI, keeping responses clean without modifying the model behavior.
Stack: Groq (llama-4-scout-17b), ChromaDB, SentenceTransformer (all-MiniLM-L6-v2), PyMuPDF, Streamlit, uv
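
A rough sketch of the two post-retrieval guards described above, under stated assumptions. The source does not say whether `SCORE_THRESHOLD` is compared against a distance or a similarity; this sketch assumes a maximum cosine distance (lower is closer). It also assumes the model wraps its reasoning in `think` tags. `filter_hits` is a hypothetical helper, and the body of `strip_thinking` is a guess at the idea, not the project's implementation.

```python
import re

SCORE_THRESHOLD = 0.35  # value from the project; treated here as a max cosine distance (assumption)
THINK_TAG = "think"     # assumed tag name for reasoning spans


def filter_hits(hits: list[tuple[str, float]], max_distance: float = SCORE_THRESHOLD) -> list[str]:
    """Keep only chunks whose cosine distance is within the threshold."""
    return [text for text, distance in hits if distance <= max_distance]


def strip_thinking(text: str, tag: str = THINK_TAG) -> str:
    """Remove <tag>...</tag> reasoning spans, including multi-line ones."""
    pattern = re.compile(rf"<{tag}>.*?</{tag}>\s*", re.DOTALL)
    return pattern.sub("", text).strip()


# Hypothetical retrieval results as (chunk text, cosine distance) pairs.
hits = [("refund policy, page 3", 0.21), ("unrelated appendix", 0.62)]
grounded = filter_hits(hits)

raw = f"<{THINK_TAG}>user asks about refunds\ncheck page 3</{THINK_TAG}>\nRefunds take 5 days."
clean = strip_thinking(raw)
```

If `grounded` comes back empty, the app can decline to cite a source instead of passing weak matches to the LLM, which is the behavior the threshold is meant to enforce.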

🤖 AI Agent Engine | 🔗 Repository

A 4-layer autonomous agent pipeline built natively in Python — architected before a single file was created.

Zero-LLM Routing Layer: Designed a deterministic cache at the top of the pipeline that resolves ~80% of routine queries with zero LLM API calls, so speed and cost are handled at the architecture level, not the prompt level.
Sub-50ms Semantic Search: Integrated ChromaDB + SentenceTransformer maintaining ~30ms semantic search latency as the second routing layer before any external API call is made.
Strict Execution Economics: Planner → Validator → Executor pipeline with hard quota enforcement keeps per-session cost at ~$0.0005 — a constraint that was designed in, not optimized in later.
Stack: Python 3.11+, Gemini API, ChromaDB, SentenceTransformer, DuckDuckGo, Open-Meteo
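
The layering above can be sketched roughly as follows. This is an illustration under assumptions, not the engine's code: `LayeredRouter`, `normalize`, and the injected `semantic_lookup` and `llm_call` callables are hypothetical, and the semantic layer is stubbed out where the real project would query ChromaDB.

```python
from typing import Callable, Optional


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so trivially different queries share a key."""
    return " ".join(query.lower().split())


class LayeredRouter:
    def __init__(
        self,
        semantic_lookup: Callable[[str], Optional[str]],  # stand-in for a vector-store query
        llm_call: Callable[[str], str],                   # stand-in for the external LLM API
        max_llm_calls: int,
    ) -> None:
        self.cache: dict[str, str] = {}
        self.semantic_lookup = semantic_lookup
        self.llm_call = llm_call
        self.max_llm_calls = max_llm_calls
        self.llm_calls = 0

    def answer(self, query: str) -> tuple[str, str]:
        """Return (answer, layer) where layer names the tier that resolved the query."""
        key = normalize(query)
        # Layer 1: deterministic cache, no model involved.
        if key in self.cache:
            return self.cache[key], "cache"
        # Layer 2: local semantic search, still no external API call.
        hit = self.semantic_lookup(key)
        if hit is not None:
            self.cache[key] = hit
            return hit, "semantic"
        # Layer 3: the LLM, guarded by a hard per-session quota.
        if self.llm_calls >= self.max_llm_calls:
            raise RuntimeError("LLM quota for this session is exhausted")
        self.llm_calls += 1
        result = self.llm_call(query)
        self.cache[key] = result
        return result, "llm"


router = LayeredRouter(
    semantic_lookup=lambda q: None,              # always misses in this demo
    llm_call=lambda q: f"LLM answer for {q!r}",  # fake model
    max_llm_calls=1,
)
first = router.answer("What is RAG?")
second = router.answer("  what is   rag? ")
```

In this demo the second query differs only in case and spacing, so it resolves from the cache and the LLM call counter stays at one.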

📦 Other Projects

Project	What it does	Stack
🔌 DevMind — MCP Server	Local Model Context Protocol tool server giving LLMs secure, HITL-gated access to the file system — read, write, execute Python snippets, format JSON, count tokens	Python, MCP SDK, tiktoken
🗄️ LangGraph SQL Runner	Multi-question parallel SQL execution using LangGraph's Send API for dynamic fan-out across schema analysis and execution nodes, with inline HITL review before any query fires	LangGraph, Groq, SQLite, Pydantic
💰 Finance Agent CLI	Terminal-based personal finance assistant — 8 natural language commands, zero cloud retention, all transaction data stays local	Python, Groq, JSON, CLI
🎯 NextSteps	Resume-to-JD gap analyzer — parses unstructured resumes against job descriptions or URLs and outputs skill mapping + actionable roadmap	Python, Groq, Tavily
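
As an illustration of the HITL gating mentioned for DevMind, here is a minimal sketch in plain Python. It does not use the MCP SDK and is not the project's code: `gated_write`, `ApprovalDenied`, and the `approve` callback are hypothetical, and in a real tool server the callback would prompt the user instead of being a lambda.

```python
import tempfile
from pathlib import Path
from typing import Callable


class ApprovalDenied(Exception):
    """Raised when the human reviewer rejects the requested action."""


def gated_write(root: Path, relative: str, content: str, approve: Callable[[str], bool]) -> Path:
    """Write a file under root only if the path is inside root and a human approves."""
    base = root.resolve()
    target = (base / relative).resolve()
    # Refuse paths that escape the sandbox root, such as "../secrets".
    if not target.is_relative_to(base):
        raise ValueError(f"{relative!r} escapes the allowed root")
    # Ask the human before touching the file system.
    if not approve(f"write {len(content)} chars to {target}"):
        raise ApprovalDenied(relative)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    written = gated_write(root, "notes/todo.txt", "ship it", approve=lambda summary: True)
    saved_text = written.read_text(encoding="utf-8")

    try:
        gated_write(root, "notes/other.txt", "nope", approve=lambda summary: False)
        denied = False
    except ApprovalDenied:
        denied = True

    try:
        gated_write(root, "../escape.txt", "x", approve=lambda summary: True)
        escaped = True
    except ValueError:
        escaped = False
```

The path check runs before the approval prompt, so a reviewer is never asked to approve a write that the sandbox would reject anyway.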

⚙️ Technical Stack

🤖 AI & Agent Systems

🗄️ Vector Storage & Embeddings

🐍 Languages & Backend

🛠️ Tools & Interfaces

🔨 Currently Building

Capsule: a personal manager for saved content, built in a deliberate order: schema first, then logic, then interface.

Frontend: Telegram bot + browser extension for saving content from anywhere
Backend: FastAPI with async endpoints and Pydantic-validated request/response models
Processing: Groq for content summarization and tagging at save-time
Storage: SQLite for structured metadata + ChromaDB for semantic search across saved content
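
A small sketch of the hybrid storage split, assuming one shared ID per saved item. It is not Capsule's code: `CapsuleStore`, the table schema, and the example URLs are hypothetical, and a toy bag-of-words vector with an in-memory dict stands in for a real embedding model and ChromaDB so the example runs on the standard library alone.

```python
import math
import sqlite3
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class CapsuleStore:
    def __init__(self) -> None:
        # Structured metadata lives in SQLite.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE items (id INTEGER PRIMARY KEY, url TEXT NOT NULL, summary TEXT NOT NULL)"
        )
        # Vectors live in a separate index keyed by the same row id.
        self.vectors: dict[int, Counter] = {}

    def save(self, url: str, summary: str) -> int:
        cur = self.db.execute("INSERT INTO items (url, summary) VALUES (?, ?)", (url, summary))
        self.db.commit()
        item_id = cur.lastrowid
        self.vectors[item_id] = embed(summary)
        return item_id

    def search(self, query: str) -> Optional[tuple[str, str]]:
        """Rank by vector similarity, then fetch the matching metadata row."""
        if not self.vectors:
            return None
        q = embed(query)
        best_id = max(self.vectors, key=lambda i: cosine(q, self.vectors[i]))
        return self.db.execute(
            "SELECT url, summary FROM items WHERE id = ?", (best_id,)
        ).fetchone()


store = CapsuleStore()
store.save("https://example.com/rag", "notes on retrieval augmented generation")
store.save("https://example.com/bread", "sourdough bread recipe")
best = store.search("bread recipe ideas")
```

Keeping the row ID as the only link between the two stores means the semantic index can be rebuilt from SQLite at any time without losing metadata.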

Building this because every content manager I tried either had no AI or had AI bolted on. This one is designed around it.

🌐 Let's Connect

I'm actively looking for AI Engineer roles at startups where AI is the product, not a feature. If that's you — or you know someone building that — reach out.
