Commit 73e1787

Improve README with descriptive intro and fixes
- Added Overview section with problem statement and solution
- Added How It Works explanation of the agentic RAG flow
- Added Component Selection table explaining technology choices
- Fixed visual vertical bar alignment in architecture diagram
- Added note for users wanting to use their own dataset

Signed-off-by: Patrick Moorhead <pmoorhead@nvidia.com>
1 parent 8fb26be commit 73e1787

1 file changed

Lines changed: 44 additions & 2 deletions

examples/mcp_rag_demo/README.md

@@ -17,11 +17,52 @@ limitations under the License.
1717

1818
# MCP RAG Demo with NVIDIA NIMs
1919

20-
This example demonstrates how to expose custom tools via the Model Context Protocol (MCP) using NVIDIA NeMo Agent toolkit with NVIDIA NIM integration. It showcases semantic search, filtering, and reranking of support tickets using NVIDIA NIMs for embedding, LLM reasoning, and reranking.
20+
## Overview
21+
22+
### The Problem
23+
24+
Enterprise AI applications need to connect Large Language Models (LLMs) to external data sources and tools. However, each integration typically requires custom code, leading to:
25+
26+
- **Fragmented tool ecosystems** - Different frameworks require different integration patterns
27+
- **Vendor lock-in** - Tools built for one AI platform don't work with others
28+
- **Security complexity** - Each integration needs its own authentication handling
29+
- **Maintenance burden** - Updates to tools require changes across multiple integrations
30+
31+
### The Solution
32+
33+
This example demonstrates how to solve these challenges using the **Model Context Protocol (MCP)** - an open standard that enables AI applications to securely connect to external tools through a unified interface. Tools exposed via MCP are accessible to any MCP-compatible client, including Claude Desktop, Cursor IDE, and custom agents.
34+
35+
### How It Works
36+
37+
The demo implements an **Agentic RAG (Retrieval-Augmented Generation)** system for searching support tickets:
38+
39+
1. **User asks a question** via the chat UI (e.g., "Find critical GPU driver issues")
40+
2. **ReAct Agent reasons** about which tools to use and in what order
41+
3. **MCP Tools execute** - semantic search, filtering, and reranking operations
42+
4. **NVIDIA NIMs process** the requests using GPU-accelerated AI models
43+
5. **Agent synthesizes** the results into a coherent response
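
The five steps above can be sketched as a small reason-act loop. This is a toy illustration only, not the NeMo Agent toolkit API: the ticket data, the tool functions, and `scripted_policy` (a stand-in for the LLM's reasoning) are all hypothetical, and the tools run in-process instead of over MCP.

```python
# Hypothetical in-memory tickets; the real demo stores tickets in Milvus.
TICKETS = [
    {"id": 1, "title": "GPU driver crash on boot", "severity": "critical"},
    {"id": 2, "title": "Password reset email delayed", "severity": "low"},
    {"id": 3, "title": "GPU driver install fails", "severity": "critical"},
]


def search_tickets(query: str) -> list:
    """Stand-in for semantic search: keep tickets sharing a word with the query."""
    words = set(query.lower().split())
    return [t for t in TICKETS if words & set(t["title"].lower().split())]


def filter_by_severity(tickets: list, severity: str) -> list:
    """Keep only tickets with the requested severity."""
    return [t for t in tickets if t["severity"] == severity]


# Tool registry: in the demo these would be exposed through an MCP server.
TOOLS = {"search": search_tickets, "filter": filter_by_severity}


def scripted_policy(question: str, observations: dict):
    """Stand-in for the LLM: pick the next action from what has been observed."""
    if "search" not in observations:
        return "search", {"query": question}
    if "filter" not in observations and "critical" in question.lower():
        return "filter", {"tickets": observations["search"], "severity": "critical"}
    return "finish", {}


def run_agent(question: str, max_steps: int = 5) -> str:
    """Reason -> act -> observe until the policy decides to finish."""
    observations = {}
    for _ in range(max_steps):
        action, args = scripted_policy(question, observations)
        if action == "finish":
            break
        observations[action] = TOOLS[action](**args)
    final = observations.get("filter", observations.get("search", []))
    titles = ", ".join(t["title"] for t in final)
    return f"Found {len(final)} ticket(s): {titles}"


print(run_agent("Find critical GPU driver issues"))
```

In the actual demo the policy is the LLM and each tool call is a request to the MCP server, but the control flow has the same shape: the agent picks a tool, observes its result, and repeats until it can answer.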
44+
45+
### Component Selection
46+
47+
| Component | Technology | Why This Choice |
48+
|-----------|------------|-----------------|
49+
| **Protocol** | MCP (Streamable HTTP) | Open standard with auth support, works with any MCP client |
50+
| **Agent Framework** | NeMo Agent toolkit | Native MCP server/client, YAML config, production-ready |
51+
| **Vector Database** | Milvus | GPU-accelerated with cuVS, scales to billions of vectors |
52+
| **Embeddings** | `nvidia/nv-embedqa-e5-v5` | High-quality 1024-dim embeddings optimized for Q&A retrieval |
53+
| **LLM** | `meta/llama-3.1-70b-instruct` | Strong reasoning for agent orchestration and response generation |
54+
| **Reranker** | `nvidia/llama-3.2-nv-rerankqa-1b-v2` | Improves retrieval precision by reordering results by relevance |
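
To illustrate why the table pairs a vector database with a reranker, here is a minimal two-stage retrieval sketch. It makes several assumptions: the 3-dimensional vectors are toy stand-ins for the 1024-dimensional embeddings, cosine similarity stands in for the Milvus search, and word overlap stands in for the reranker model. The function names and sample documents are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, docs, top_k):
    """Stage 1: nearest neighbors by embedding similarity (the vector database's role)."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:top_k]


def rerank(query, candidates):
    """Stage 2: rescore each (query, text) pair; word overlap stands in for the reranker."""
    q_words = set(query.lower().split())
    return sorted(
        candidates,
        key=lambda d: len(q_words & set(d["text"].lower().split())),
        reverse=True,
    )


# Toy corpus with made-up 3-dim vectors.
DOCS = [
    {"text": "GPU fan noise under load", "vec": [0.9, 0.1, 0.0]},
    {"text": "GPU driver crash after update", "vec": [0.8, 0.3, 0.0]},
    {"text": "Billing invoice missing", "vec": [0.0, 0.1, 0.9]},
]

candidates = retrieve([1.0, 0.0, 0.0], DOCS, top_k=2)
reranked = rerank("gpu driver crash", candidates)
print([d["text"] for d in candidates])
print([d["text"] for d in reranked])
```

The first stage is cheap and recall-oriented, so it can scan the whole collection. The second stage scores only the few candidates it returns, which is where a heavier relevance model can afford to run.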
55+
56+
---
2157

2258
## Table of Contents
2359

2460
- [MCP RAG Demo with NVIDIA NIMs](#mcp-rag-demo-with-nvidia-nims)
61+
- [Overview](#overview)
62+
- [The Problem](#the-problem)
63+
- [The Solution](#the-solution)
64+
- [How It Works](#how-it-works)
65+
- [Component Selection](#component-selection)
2566
- [Table of Contents](#table-of-contents)
2667
- [Key Features](#key-features)
2768
- [Architecture](#architecture)
@@ -67,7 +108,7 @@ This demo uses a 3-terminal architecture:
67108

68109
```
69110
┌─────────────┐ REST ┌─────────────────┐
70-
│ NAT UI │ ◄──────────────────► │ NAT UI Server │
111+
│ NAT UI │ ◄──────────────────► │ NAT UI Server │
71112
│ (Browser) │ │ (MCP Client) │
72113
└─────────────┘ └────────┬────────┘
73114
@@ -127,6 +168,7 @@ docker-compose up -d
127168
```
128169

129170
### Load Sample Data
171+
**Note:** The sample dataset is synthetic. To use your own data, modify `load_support_tickets.py` with your Milvus connection and data schema, then update the tool queries in `register.py` to match your fields.
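
As a rough sketch of what adapting the loader to your own data might involve, the snippet below maps source records onto the fields the tools query and checks the embedding size. Every name here is hypothetical (`FIELD_MAP`, `to_row`, and the column names are not taken from `load_support_tickets.py` or `register.py`). The 1024 dimension is the one listed for `nvidia/nv-embedqa-e5-v5` in the Component Selection table.

```python
EMBEDDING_DIM = 1024  # per the Component Selection table; adjust if you change models

# Hypothetical mapping: your source column name -> field name the tools query.
FIELD_MAP = {"issue_id": "ticket_id", "summary": "title", "priority": "severity"}


def to_row(record: dict, embedding: list) -> dict:
    """Convert one source record plus its embedding into an insertable row."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM}-dim embedding, got {len(embedding)}"
        )
    missing = [src for src in FIELD_MAP if src not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    row = {dst: record[src] for src, dst in FIELD_MAP.items()}
    row["embedding"] = embedding
    return row


row = to_row(
    {"issue_id": 7, "summary": "Fan failure", "priority": "high"},
    [0.0] * EMBEDDING_DIM,
)
print(sorted(row))
```

Validating rows before insertion surfaces schema mismatches at load time, before they can show up as empty or failing tool queries.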
130172

131173
```bash
132174
python examples/mcp_rag_demo/scripts/load_support_tickets.py
