A simple yet powerful local AI assistant that runs entirely on your machine. Built for learning and experimentation, Little MCP combines the power of open-source LLMs with advanced Retrieval-Augmented Generation (RAG) to create an intelligent chatbot that can work with your personal documents and provide real-time information.
- Local LLM Integration: Powered by Ollama for complete privacy and offline functionality
- Optional: Anthropic Claude LLM (requires an API key)
- Interface Options: graphical web interface or classic text interface
- RAG System: Query and extract information from your PDF documents using vector embeddings
- MCP Server/Client Architecture: Demonstrates Model Context Protocol implementation with FastAPI
- Thinking/no-thinking modes: optionally display the model's thinking process
- Multi-Tool Agent: Access to multiple tools including:
- Document Q&A system (RAG-based)
- Local SQL database (MariaDB)
- Real-time weather information (OpenWeather API)
- Current date and time for any city
- Arithmetic calculations
- Conversational Memory: Maintains context throughout your chat session
- Vector Store Persistence: Efficiently stores and reuses document embeddings
- Python 3.8+
- Ollama installed and running locally
- OpenWeather API key (free tier available at openweathermap.org)
Download these models before running the application:
ollama pull qwen3:4b (not required when using Claude)
ollama pull nomic-embed-text
In your first terminal:
python mcp_server.py
You should see:
Starting MCP Server ...
INFO: Uvicorn running on http://127.0.0.1:8000
In a second terminal:
Activate your environment first if it is not yet active: source ../.venv/bin/activate
# Local Ollama silent
python little_mcp.py [graph]
# Local Ollama thinking
python little_mcp.py --think [graph]
# Use Claude (API key required)
python little_mcp.py --provider anthropic [graph]
# Use Claude with thinking mode
python little_mcp.py --provider anthropic --think [graph]
# Override Claude model
python little_mcp.py --provider anthropic --model claude-opus-4-5 [--think]
Note: add the graph parameter to launch the graphical interface.
When using the graph interface, open this local URL in your browser:
http://127.0.0.1:7860
You: What's the weather in Paris?
You: What time is it in Tokyo?
You: What information is in my document about candidates?
You: Calculate ADD, 15, 27
Type quit, exit, or bye to close the client application.
The FastAPI-based server provides RESTful endpoints for various tools:
DateTime Tool:
- Uses Nominatim for geocoding city names
- Determines timezone using TimezoneFinder
- Returns current date, time, and day of week
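The final formatting step can be sketched as follows. This assumes the IANA timezone name has already been resolved (the real tool derives it from the city name via Nominatim and TimezoneFinder); the function name is illustrative, not taken from the source:

```python
# Sketch of the DateTime tool's last step: formatting the current
# date, time, and day of week for a known IANA timezone. The real tool
# first geocodes the city (Nominatim) and maps the coordinates to a
# timezone (TimezoneFinder); here the timezone is assumed resolved.
from datetime import datetime
from zoneinfo import ZoneInfo

def format_city_time(tz_name: str) -> dict:
    """Return current date, time, and day of week for an IANA timezone."""
    now = datetime.now(ZoneInfo(tz_name))
    return {
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H:%M:%S"),
        "day_of_week": now.strftime("%A"),
    }

print(format_city_time("Asia/Tokyo"))
```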
Weather Tool:
- Integrates with OpenWeather API
- Returns current weather conditions in metric units
- Includes temperature, humidity, and weather description
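A minimal sketch of the request and response handling, assuming the standard OpenWeather current-weather endpoint and that the key lives in an `OPENWEATHER_API_KEY` environment variable (the variable name is an assumption; check your `.env` file):

```python
# Sketch of the Weather tool: query OpenWeather's current-weather
# endpoint in metric units and extract the fields the tool reports.
# The API key variable name OPENWEATHER_API_KEY is an assumption.
import json
import os
import urllib.parse
import urllib.request

API_URL = "https://api.openweathermap.org/data/2.5/weather"

def parse_weather(payload: dict) -> dict:
    """Pick out temperature, humidity, and description from a response."""
    return {
        "temperature_c": payload["main"]["temp"],
        "humidity_pct": payload["main"]["humidity"],
        "description": payload["weather"][0]["description"],
    }

def get_weather(city: str) -> dict:
    """Fetch current weather for a city (metric units)."""
    query = urllib.parse.urlencode({
        "q": city,
        "units": "metric",
        "appid": os.environ["OPENWEATHER_API_KEY"],
    })
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=10) as resp:
        return parse_weather(json.load(resp))
```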
Calculator Tool:
- Supports basic arithmetic operations (ADD, SUB, MUL, DIV)
- Input format: "OPERATION, NUM1, NUM2"
- Example: "ADD, 5, 3" returns 8
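The parsing and dispatch can be sketched in a few lines (an illustrative reimplementation, not the server's exact code):

```python
# Sketch of the Calculator tool: parse "OPERATION, NUM1, NUM2"
# and dispatch to the matching arithmetic operation.
def calculate(expression: str) -> float:
    ops = {
        "ADD": lambda a, b: a + b,
        "SUB": lambda a, b: a - b,
        "MUL": lambda a, b: a * b,
        "DIV": lambda a, b: a / b,
    }
    op, a, b = (part.strip() for part in expression.split(","))
    return ops[op.upper()](float(a), float(b))

print(calculate("ADD, 5, 3"))  # → 8.0
```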
The LangChain-based client orchestrates multiple components:
RAG System:
- Loads your PDF document using PyPDFLoader
- Splits the document into manageable chunks
- Converts chunks into vector embeddings using Ollama's nomic-embed-text model
- Stores embeddings in a Chroma vector database
- Retrieves relevant context when you ask questions
- Generates answers using the LLM with retrieved context
Agent Architecture:
- Analyzes your queries to determine the best tool to use
- Routes questions about your document to the RAG system
- Uses MCP server tools for real-time information
- Falls back to general knowledge when no tool is suitable
- Maintains conversation history for context-aware responses
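The routing decision can be illustrated deterministically. In the real client the LangChain agent lets the LLM choose among the registered tool descriptions; this keyword-based version only sketches the idea:

```python
# Toy illustration of the agent's tool-routing decision. The actual
# agent is LLM-driven; this keyword matcher is a simplification.
def route(query: str) -> str:
    q = query.lower()
    if "document" in q:
        return "rag"            # questions about your PDF
    if "weather" in q:
        return "weather"        # MCP weather tool
    if "time" in q or "date" in q:
        return "datetime"       # MCP date/time tool
    if "calculate" in q:
        return "calculator"     # MCP calculator tool
    return "general_llm"        # fall back to general knowledge

print(route("What's the weather in Paris?"))  # → weather
```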
Edit the LLM constant in little_mcp.py:
LLM = "llama2"  # or mistral, mixtral, phi, etc.
Server Side (mcp_server.py):
- Create your tool function:
def get_my_tool(param: str):
    # Your logic here
    return {"result": "data"}
- Add an API endpoint:
@app.get("/get_my_tool")
def api_get_my_tool(myParam: str = Query(...)):
    result = get_my_tool(myParam)
    return result
Client Side (little_mcp.py):
Add to the mcp_tools_config list:
{
    "name": "my_tool",
    "description": "Description for the agent to understand when to use this tool",
    "function_name": "get_my_tool"
}
Modify in little_mcp.py:
# Number of document chunks to retrieve
retriever = self.vector_store.as_retriever(search_kwargs={'k': 3})
# Chunk size and overlap
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
Little_MCP/
├── mcp_server.py       # FastAPI MCP server
├── little_mcp.py       # LangChain client application
├── data/               # PDF documents directory
├── chroma_db_rag/      # Vector store (auto-generated)
├── .env                # Environment variables (API keys)
├── requirements.txt    # Python dependencies
└── README.md           # This file
Server won't start:
- Check if port 8000 is already in use
- Verify your .env file contains the API key
Weather tool fails:
- Confirm your OpenWeather API key is valid and active
- Check your internet connection
Vector store issues:
- Delete the chroma_db_rag directory to rebuild from scratch
Ollama connection errors:
- Ensure Ollama is running (ollama serve)
- Verify models are downloaded (ollama list)
PDF not found:
- Verify the path in PDF_DOCUMENT_PATH
- Ensure the file exists in the data directory
Client can't connect to server:
- Confirm the server is running on port 8000
- Check the SERVER_URL configuration
You: What's the weather in London?
Assistant: The current weather in London is 12Β°C with light rain...
You: What time is it in New York?
Assistant: In New York, it's currently 2024-10-22 14:30:15 (Tuesday)...
You: Calculate ADD, 25, 17
Assistant: 42.0
You: What does my document say about candidate scores?
Assistant: Based on the document, the candidates have the following scores...
Contributions are welcome! This project is designed for learning and experimentation. Feel free to:
- Add new tools and capabilities
- Improve the RAG system
- Enhance the agent's reasoning
- Add support for more document types
- Implement additional MCP endpoints
MIT License
Copyright (c) 2025
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Built with:
- LangChain - Agent framework
- FastAPI - Modern web framework for APIs
- Ollama - Local LLM runtime
- ChromaDB - Vector database
- OpenWeatherMap - Weather data API
- Model Context Protocol (MCP) - Communication standard
Version: 0.1.01
Made with ❤️ for AI learning and experimentation