A Retrieval-Augmented Generation (RAG) chatbot with a FastAPI/Chroma backend and an elegant web UI.
Use this order:
- Complete local setup (Setup section).
- Configure production env values (Local vs Production section).
- Follow Production deployment flow (nginx) end-to-end.
- Use Reverse proxy alternatives if you choose Caddy instead of nginx.
- Create a virtual environment and install dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
- Create your environment file:
cp .env.example .env
Edit .env and set OPENAI_API_KEY.
If you want endpoint protection enabled, generate an app key and set APP_API_KEY in .env:
openssl rand -base64 32
Then copy that value into:
APP_API_KEY=YOUR_GENERATED_VALUE
Optional but recommended for any internet-facing deployment:
- APP_API_KEY: shared API key required for /chat, /ingest, and /documents*.
- MAX_UPLOAD_MB: maximum upload size for document uploads (default: 2).
- RATE_LIMIT_PER_MIN: per-client per-route request limit per minute (default: 10).
- RETRIEVER_K: number of chunks returned to the LLM (default: 12).
- RETRIEVER_SEARCH_TYPE: retrieval mode (mmr or similarity, default: mmr).
- RETRIEVER_FETCH_K: candidate pool size used by MMR (default: 24).
- RETRIEVER_LAMBDA_MULT: MMR diversity/score balance (default: 0.35).
- TOKENIZERS_PARALLELISM: set to false to reduce local multiprocessing/tokenizer warning noise.
- DEBUG_RAG: set to true to include retrieval diagnostics in /chat JSON responses.
These defaults are intentionally conservative for limited-access demos. For broader production traffic, tune limits based on expected load, abuse risk, and budget.
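To make the three MMR settings less abstract, here is a minimal, generic sketch of maximal marginal relevance selection in plain Python. It is not this project's retriever code (that is not shown here); the function names `cosine` and `mmr_select` and the `(doc_id, vector)` candidate format are assumptions for illustration. The defaults mirror RETRIEVER_K, RETRIEVER_FETCH_K, and RETRIEVER_LAMBDA_MULT above.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidates, k=12, fetch_k=24, lambda_mult=0.35):
    """Pick up to k doc ids from candidates, a list of (doc_id, vector) pairs.

    fetch_k     -> size of the candidate pool taken by plain similarity first
    lambda_mult -> 1.0 means pure relevance, 0.0 means maximum diversity
    """
    # Step 1: keep only the fetch_k candidates most similar to the query.
    pool = sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    pool = pool[:fetch_k]

    selected = []
    while pool and len(selected) < k:

        def score(candidate):
            relevance = cosine(query_vec, candidate[1])
            # Penalize similarity to anything already selected.
            redundancy = max(
                (cosine(candidate[1], s[1]) for s in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        # Step 2: greedily take the best relevance/diversity trade-off.
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)

    return [doc_id for doc_id, _ in selected]
```

With a low `lambda_mult` such as 0.35, a near-duplicate of an already selected chunk tends to lose to a less similar but more novel chunk; raising it toward 1.0 makes the result approach plain similarity ranking.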
- Add documents to ./data.
- Start the server:
python3 -m uvicorn app.main:app --reload
Open http://127.0.0.1:8000 in your browser.
If APP_API_KEY is set, enter it in the in-page App API key field and click Save key. The key panel hides after a successful authenticated request and reappears if the server returns 401.
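The key behavior described above (no key configured means open access; otherwise a missing or wrong key gets 401) can be sketched as a small pure function. This is an illustrative assumption about the logic, not the app's actual middleware; `check_api_key` is a hypothetical name, and the headers are assumed to be a dict with lowercase keys.

```python
import hmac


def check_api_key(headers, app_api_key):
    """Return the HTTP status a protected endpoint would answer with.

    headers     -- request headers as a dict with lowercase keys (assumption)
    app_api_key -- value of APP_API_KEY, or None/"" when protection is off
    """
    if not app_api_key:
        # APP_API_KEY is not set: the x-api-key header is not required.
        return 200
    supplied = headers.get("x-api-key", "")
    # Constant-time comparison avoids leaking key contents through timing.
    if hmac.compare_digest(supplied.encode(), app_api_key.encode()):
        return 200
    return 401
```

A 401 from this kind of check is what makes the UI show the key panel again.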
Before pushing this project to GitHub:
- Keep .env out of version control (it contains real secrets).
- Commit only .env.example with placeholder values.
- Confirm .gitignore includes .env and .env.* plus !.env.example.
- Rotate any API keys that were ever shared, pasted in chats, or committed by mistake.
- If a secret was committed, remove it from git history and rotate it immediately.
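When rotating APP_API_KEY, a fresh value can also be generated from Python instead of openssl. The sketch below produces the same shape of value as `openssl rand -base64 32` (32 random bytes, base64-encoded); the function name `generate_app_api_key` is a made-up helper, not part of this project.

```python
import base64
import secrets


def generate_app_api_key(num_bytes=32):
    """Return num_bytes of cryptographically secure randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value into .env as APP_API_KEY=...
    print(generate_app_api_key())
```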
Use the "Rebuild index" button in the UI or run:
curl -X POST http://127.0.0.1:8000/ingest \
  -H "x-api-key: YOUR_APP_API_KEY"
If APP_API_KEY is not set, the x-api-key header is not required.
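For scripted rebuilds, the same request can be made from Python's standard library. This is a sketch under the assumption that /ingest accepts an empty POST exactly as in the curl call above; the helper names `build_ingest_request` and `rebuild_index` are illustrative only.

```python
import urllib.request


def build_ingest_request(base_url="http://127.0.0.1:8000", api_key=None):
    """Build (but do not send) the POST /ingest request."""
    headers = {}
    if api_key:
        # Only needed when APP_API_KEY is set on the server.
        headers["x-api-key"] = api_key
    return urllib.request.Request(f"{base_url}/ingest", headers=headers, method="POST")


def rebuild_index(base_url="http://127.0.0.1:8000", api_key=None, timeout=300):
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx answers such as 401.
    """
    request = build_ingest_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The generous timeout is a guess; indexing time depends on how many documents are in ./data.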
Use the same code locally and in production; change only runtime/infrastructure settings.
For local development:
- Keep the app directly accessible at 127.0.0.1:8000.
- Use auto-reload for fast iteration.
python3 -m uvicorn app.main:app --reload
For production:
- Run the app on localhost only (127.0.0.1:8000).
- Put nginx in front for HTTPS and public access (80/443).
- Do not use --reload.
python3 -m uvicorn app.main:app --host 127.0.0.1 --port 8000
Recommended production env values in .env:
- strong APP_API_KEY
- tuned RATE_LIMIT_PER_MIN
- conservative MAX_UPLOAD_MB
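To get a feel for what a given RATE_LIMIT_PER_MIN allows, here is a minimal sketch of a per-client, per-route limiter. The app's real limiter is not shown in this README, so the class name `PerRouteRateLimiter`, the sliding-window strategy, and the injectable clock are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class PerRouteRateLimiter:
    """Allow at most limit_per_min requests per (client, route) per window."""

    def __init__(self, limit_per_min=10, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit_per_min
        self.window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # (client_id, route) -> request timestamps

    def allow(self, client_id, route):
        now = self._clock()
        hits = self._hits[(client_id, route)]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # caller would typically answer with HTTP 429
        hits.append(now)
        return True
```

Because the counter is keyed by both client and route, a client that exhausts its /chat budget can still call /ingest, which matches the "per-client per-route" wording of the setting.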
Use this sequence on a Linux server.
cp .env.example .env
Set at least:
- OPENAI_API_KEY
- APP_API_KEY (strong random value)
- MAX_UPLOAD_MB
- RATE_LIMIT_PER_MIN
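Before starting the service, it can help to check that these four values are actually filled in. The sketch below is a hypothetical preflight helper, not part of the project; it assumes .env contains simple KEY=VALUE lines and treats the YOUR_GENERATED_VALUE placeholder as missing.

```python
REQUIRED = ("OPENAI_API_KEY", "APP_API_KEY", "MAX_UPLOAD_MB", "RATE_LIMIT_PER_MIN")
PLACEHOLDERS = {"", "YOUR_GENERATED_VALUE"}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(text):
    """Return the required names that are absent, empty, or still placeholders."""
    values = parse_env(text)
    return [name for name in REQUIRED if values.get(name, "") in PLACEHOLDERS]
```

A typical call would read the file and stop the deploy when the list is non-empty, for example `missing_settings(open(".env").read())`.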
Sample unit file is included at deploy/doc-chat.service.
sudo cp deploy/doc-chat.service /etc/systemd/system/doc-chat.service
sudo systemctl daemon-reload
sudo systemctl enable --now doc-chat
sudo systemctl status doc-chat
Use this if you want to customize service fields (paths/user may differ):
[Unit]
Description=Doc Chat FastAPI service
After=network.target
[Service]
User=www-data
WorkingDirectory=/opt/doc_chat
EnvironmentFile=/opt/doc_chat/.env
ExecStart=/opt/doc_chat/.venv/bin/uvicorn app.main:app --host 127.0.0.1 --port 8000
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target
After customizing the unit file, run the same systemctl daemon-reload and service enable/start commands shown above.
Sample config is included at deploy/nginx.conf.
- Replace example.com and certificate paths in deploy/nginx.conf.
- Install/load the config in nginx.
- Test and reload nginx:
sudo nginx -t
sudo systemctl reload nginx
- Visit https://your-domain and confirm chat UI loads.
- Confirm protected endpoints reject missing key (401).
- Confirm normal traffic works with valid key from the UI.
- Check logs for startup/requests:
sudo systemctl status doc-chat
sudo journalctl -u doc-chat -n 100 --no-pager
Both proxy samples terminate HTTPS, forward to 127.0.0.1:8000, and enforce upload limits.
- deploy/nginx.conf
- deploy/Caddyfile
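Whichever proxy is in front, the post-deploy checks (the UI loads, protected endpoints answer 401 without a key) can be scripted. The following is a rough sketch using only the standard library; `probe` is a made-up helper, and https://your-domain is the same placeholder used in the steps above.

```python
import urllib.error
import urllib.request


def probe(url, api_key=None, method="GET", timeout=10):
    """Return the HTTP status code for url, including 4xx/5xx answers."""
    headers = {"x-api-key": api_key} if api_key else {}
    request = urllib.request.Request(url, headers=headers, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want to inspect.
        return err.code


if __name__ == "__main__":
    base = "https://your-domain"  # placeholder: replace with the real host
    print("UI:", probe(base + "/"))  # expect 200
    print("ingest without key:", probe(base + "/ingest", method="POST"))  # expect 401
```

Only the keyless /ingest call is probed here on purpose: sending it with a valid key would trigger a real index rebuild.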
Use this runbook for a clean ~3 minute walkthrough.
source .venv/bin/activate
python3 -m uvicorn app.main:app --reload
Open http://127.0.0.1:8000.
- Enter API key in App API key and click Save key.
- Point out:
  - "No documents found."
  - Ask disabled
  - Rebuild index disabled
- Upload one small PDF (< 2 MB).
- Point out status: "Upload complete." then "Index rebuilt."
- Who are Cornelius and Voltemand?
- Where are they mentioned?
Point out source filename links under Sources opening in a new tab.
- Delete the only document and confirm.
- Point out:
  - "Index cleared."
  - "No documents found."
  - Ask disabled again
- Upload a document and ask one final question to show full recovery.
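To avoid a failed upload mid-demo, the PDF size can be checked beforehand. This tiny helper is an assumption-laden sketch: `fits_upload_limit` is a hypothetical name, and because this README does not say whether the server counts a megabyte as 1,000,000 or 1,048,576 bytes, it uses the smaller figure so that a passing file is safe under either reading.

```python
import os

BYTES_PER_MB = 1_000_000  # conservative assumption; the server may use 1024 * 1024


def fits_upload_limit(path, max_upload_mb=2):
    """Return True if the file at path is strictly smaller than the upload limit."""
    return os.path.getsize(path) < max_upload_mb * BYTES_PER_MB
```

Pass the server's MAX_UPLOAD_MB value as the second argument if it differs from the default of 2.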