AgentMesh — Multi-Agent Orchestration with MCP & A2A

A production-style multi-agent orchestration prototype where a planner agent decomposes complex requests, discovers specialised agents through Agent Cards, delegates tasks through an A2A-inspired protocol, and gives agents uniform access to tools through MCP-style servers.

Why AgentMesh exists

Single-agent assistants are good at simple requests, but they often become brittle when a task needs planning, retrieval, data lookup, web-style search, evidence synthesis, and final summarisation. AgentMesh demonstrates a cleaner architecture:

one Planner Agent decomposes the user request,
specialised agents execute sub-tasks,
agents publish Agent Cards for discovery,
delegation happens through an A2A-style task interface,
tools are exposed through MCP-style servers,
evaluation compares single-agent and multi-agent execution on multi-step tasks.

The goal is not to implement every line of a protocol specification but to show a practical engineering pattern that recruiters and engineering teams can understand: separate agent communication from tool execution.

What this project demonstrates

| Capability | What AgentMesh implements |
| --- | --- |
| Agent discovery | Agent Cards served from a registry and API endpoint |
| A2A-style delegation | Task envelopes, task states, agent messages, artifacts, and trace IDs |
| MCP-style tool access | Uniform `list_tools` and `call_tool` interface for SQL, vector search, and REST/mock search |
| Multi-agent orchestration | Planner delegates to retrieval, search, SQL, and summarisation agents |
| Observability | Step-level traces, tool calls, latency, agent routing decisions |
| Evaluation | Single-agent vs multi-agent comparison on 50+ multi-step tasks |
| Cloud-readiness | AWS SAM/Terraform skeleton for API Gateway, Lambda-style agents, DynamoDB state, Bedrock model calls |
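
The A2A-style delegation row can be pictured as a small data contract. The sketch below is illustrative only: `TaskEnvelope`, `TaskState`, the state names, and the transition table are assumptions, not the contents of `a2a/protocol.py`.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class TaskState(str, Enum):
    # Hypothetical lifecycle states; the project's real state names may differ.
    SUBMITTED = "submitted"
    WORKING = "working"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions between states (an assumption for this sketch).
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED},
    TaskState.WORKING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


@dataclass
class TaskEnvelope:
    """One delegated sub-task travelling from the planner to a specialist."""

    sender: str
    recipient: str
    instruction: str
    trace_id: str
    task_id: str = field(default_factory=lambda: f"task_{uuid.uuid4().hex[:8]}")
    state: TaskState = TaskState.SUBMITTED
    messages: list[str] = field(default_factory=list)
    artifacts: list[dict] = field(default_factory=list)

    def transition(self, new_state: TaskState) -> None:
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state


envelope = TaskEnvelope(
    sender="planner",
    recipient="sql-agent",
    instruction="List open critical tickets",
    trace_id="trace_demo",
)
envelope.transition(TaskState.WORKING)
envelope.artifacts.append({"type": "rows", "count": 2})
envelope.transition(TaskState.COMPLETED)
```

Keeping the trace ID on the envelope is one way to let every step and tool call be tied back to the originating request.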

Architecture

flowchart LR
    U[User Request] --> API[FastAPI Gateway]
    API --> P[Planner Agent]
    P --> REG[Agent Registry / Agent Cards]
    REG --> R[Retrieval Agent]
    REG --> S[Search Agent]
    REG --> Q[SQL Agent]
    REG --> SUM[Summarisation Agent]
    R --> MCP1[MCP Vector Search Server]
    S --> MCP2[MCP REST/Search Server]
    Q --> MCP3[MCP SQL Server]
    R --> SUM
    S --> SUM
    Q --> SUM
    SUM --> API
    API --> OUT[Final Answer + Trace]
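
Read left to right, the diagram is a fan-out followed by a fan-in: specialists each produce evidence and the summarisation agent merges it. A minimal in-process sketch of that flow, assuming hypothetical stand-in functions in place of the real agents and a hand-written plan instead of the planner's output:

```python
import time
from typing import Callable


# Stand-ins for the specialist agents; names and return values are hypothetical.
def sql_agent(request: str) -> str:
    return "2 enterprise accounts with open critical tickets"


def retrieval_agent(request: str) -> str:
    return "cache invalidation runbook"


def search_agent(request: str) -> str:
    return "status page: mitigation in progress"


def summarisation_agent(evidence: list[str]) -> str:
    return " | ".join(evidence)


SPECIALISTS: dict[str, Callable[[str], str]] = {
    "sql": sql_agent,
    "retrieval": retrieval_agent,
    "search": search_agent,
}


def run(request: str, plan: list[str]) -> dict:
    """Execute each planned step, record a trace entry, then summarise."""
    trace, evidence = [], []
    for step, agent_name in enumerate(plan, start=1):
        started = time.perf_counter()
        output = SPECIALISTS[agent_name](request)
        trace.append({
            "step": step,
            "agent": agent_name,
            "latency_seconds": time.perf_counter() - started,
        })
        evidence.append(output)
    return {"final_answer": summarisation_agent(evidence), "steps": trace}


result = run("Which customers need an update?", plan=["sql", "retrieval", "search"])
```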

Demo: multi-agent run

agentmesh demo "Which enterprise customers had open critical tickets, what internal docs explain the fix, and what should the account manager send them?"

Example result:

Planner created 4 steps:
1. Query SQL tickets and accounts
2. Retrieve internal incident/runbook knowledge
3. Search external-style product status snippets
4. Summarise customer-facing action plan

Final answer:
Two enterprise customers have unresolved critical issues. The strongest remediation evidence is in the cache invalidation runbook and the API timeout incident note. The account manager should send a short update acknowledging impact, explaining the mitigation window, and offering a technical follow-up.
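
If the planner is rule-based (an assumption; the source only states that the evaluator is deterministic), decomposition could look roughly like the keyword rules below. The rule table is invented for illustration and is not the logic in `agents/planner.py`.

```python
# (keywords, agent, step description) rules checked in order; all hypothetical.
RULES = [
    (("ticket", "customer", "account"), "sql",
     "Query SQL tickets and accounts"),
    (("doc", "runbook", "fix"), "retrieval",
     "Retrieve internal incident/runbook knowledge"),
    (("status", "outage", "external"), "search",
     "Search external-style product status snippets"),
]


def plan(request: str) -> list[dict]:
    """Return ordered steps; a summarisation step is always appended last."""
    text = request.lower()
    steps = [
        {"agent": agent, "description": description}
        for keywords, agent, description in RULES
        if any(keyword in text for keyword in keywords)
    ]
    steps.append({
        "agent": "summarisation",
        "description": "Summarise customer-facing action plan",
    })
    return steps


steps = plan(
    "Which enterprise customers had open critical tickets, what internal docs "
    "explain the fix, and what should the account manager send them?"
)
```

With these toy rules the demo request yields three steps (SQL, retrieval, summarisation). The four-step plan shown above also includes a search step, which simple keyword matching does not infer from this wording.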

Evaluation snapshot

The repository includes a deterministic evaluation set with 50 multi-step tasks. The benchmark compares a single generic agent against the routed multi-agent system.

| System | Tool-selection accuracy | Task completion | Avg. steps | Avg. latency |
| --- | --- | --- | --- | --- |
| Single generic agent | 62.0% | 58.0% | 2.1 | 0.31s |
| AgentMesh multi-agent | 88.0% | 84.0% | 3.7 | 0.46s |

Interpretation: the multi-agent system uses more steps per task (3.7 vs 2.1 on average) and adds about 0.15s of latency, but it selects the right tools more often (88.0% vs 62.0%) and completes more compound requests (84.0% vs 58.0%). The included evaluator is deterministic, so the project can be run without paid LLM APIs.

Repository structure

agentmesh/
├── src/agentmesh/
│   ├── api.py                    # FastAPI gateway
│   ├── cli.py                    # CLI for demo, run, evaluate
│   ├── orchestrator.py           # Planner-led multi-agent orchestration
│   ├── state.py                  # In-memory/DynamoDB-style task state
│   ├── schemas.py                # Shared data contracts
│   ├── agents/
│   │   ├── base.py               # Base agent contract
│   │   ├── planner.py            # Planner / task decomposer
│   │   ├── retrieval_agent.py    # Knowledge retrieval specialist
│   │   ├── search_agent.py       # External-search specialist
│   │   ├── sql_agent.py          # Structured data specialist
│   │   └── summarizer_agent.py   # Final synthesis specialist
│   ├── a2a/
│   │   ├── cards.py              # Agent Card definitions
│   │   ├── registry.py           # Agent discovery registry
│   │   └── protocol.py           # A2A-style task/message envelopes
│   ├── mcp/
│   │   ├── server.py             # MCP-style server interface
│   │   └── client.py             # MCP-style client interface
│   ├── tools/
│   │   ├── sql_tools.py          # SQLite tool server
│   │   ├── vector_tools.py       # TF-IDF knowledge retrieval tool server
│   │   └── rest_tools.py         # Mock REST/search/status tool server
│   ├── evaluation/
│   │   ├── dataset.py            # Generates/evaluates 50+ tasks
│   │   └── evaluator.py          # Single vs multi-agent scoring
│   └── cloud/
│       ├── bedrock_client.py     # Bedrock-compatible abstraction
│       └── dynamodb_state.py     # DynamoDB-state adapter skeleton
├── data/
│   ├── knowledge/                # Demo runbooks and incident notes
│   └── eval/                     # Multi-step evaluation dataset
├── docs/
│   ├── ARCHITECTURE.md
│   ├── MCP_A2A_DESIGN.md
│   ├── EVALUATION.md
│   └── AWS_DEPLOYMENT.md
├── infra/
│   ├── aws-sam/template.yaml
│   └── terraform/main.tf
├── tests/
├── reports/benchmark_summary.md
├── Dockerfile
├── docker-compose.yml
├── Makefile
└── README.md

Quick start

1. Clone

git clone https://github.com/<your-username>/agentmesh.git
cd agentmesh

2. Create environment

python -m venv .venv
source .venv/bin/activate      # macOS/Linux
# .venv\Scripts\activate       # Windows PowerShell

3. Install

pip install -e ".[dev]"

4. Run local demo

agentmesh demo "Which customers have critical open tickets and what should we tell them?"

5. Start API

uvicorn agentmesh.api:app --reload

Open the interactive API docs:

http://localhost:8000/docs

API example

curl -X POST http://localhost:8000/run \
  -H "Content-Type: application/json" \
  -d '{
    "request": "Find critical customer issues, retrieve the right runbook, and draft an account-manager summary.",
    "mode": "multi_agent"
  }'

Response shape:

{
  "task_id": "task_...",
  "mode": "multi_agent",
  "final_answer": "...",
  "steps": [...],
  "tool_calls": [...],
  "metrics": {
    "latency_seconds": 0.46,
    "agents_used": 4,
    "tools_called": 5
  }
}
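
The same call can be made from Python. This sketch uses only the standard library and assumes the server from step 5 is listening on `localhost:8000`; the helper names `build_run_request` and `run_task` are hypothetical.

```python
import json
import urllib.request


def build_run_request(
    request_text: str,
    mode: str = "multi_agent",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST /run request without sending it."""
    payload = json.dumps({"request": request_text, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/run",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_task(request_text: str) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_run_request(request_text), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_run_request("Find critical customer issues and draft a summary.")
```

Calling `run_task(...)` against a running instance should return a dictionary with the keys listed in the response shape.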

Agent Cards

Each agent publishes a card with capabilities, input/output modes, and skills.

{
  "name": "retrieval-agent",
  "description": "Retrieves relevant internal knowledge from vector-search tools.",
  "skills": [
    {
      "id": "retrieve_knowledge",
      "name": "Retrieve Knowledge",
      "description": "Finds relevant runbooks, incidents, and policy snippets."
    }
  ]
}

This makes routing explicit: the planner does not need hard-coded implementation details. It can discover agent capabilities and delegate based on task intent.
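
One way to picture card-based routing is a registry that scores each card's skill text against a planner step. The keyword-overlap scoring here is a deliberately naive stand-in, and `AgentRegistry`, `register`, and `discover` are hypothetical names rather than the API of `a2a/registry.py`. Only the first card comes from the example above; the `sql-agent` card is invented for contrast.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set used for naive keyword matching."""
    return set(re.findall(r"[a-z]+", text.lower()))


class AgentRegistry:
    def __init__(self) -> None:
        self._cards: list[dict] = []

    def register(self, card: dict) -> None:
        self._cards.append(card)

    def discover(self, step: str) -> str:
        """Return the name of the agent whose skills best overlap the step."""
        def score(card: dict) -> int:
            skill_text = " ".join(
                f"{skill['name']} {skill['description']}" for skill in card["skills"]
            )
            return len(tokens(step) & tokens(skill_text))

        return max(self._cards, key=score)["name"]


registry = AgentRegistry()
registry.register({
    "name": "retrieval-agent",
    "description": "Retrieves relevant internal knowledge from vector-search tools.",
    "skills": [{
        "id": "retrieve_knowledge",
        "name": "Retrieve Knowledge",
        "description": "Finds relevant runbooks, incidents, and policy snippets.",
    }],
})
registry.register({
    "name": "sql-agent",  # hypothetical card, not taken from the repository
    "description": "Answers questions from structured ticket and account data.",
    "skills": [{
        "id": "query_tickets",
        "name": "Query Tickets",
        "description": "Looks up tickets and accounts by severity and status.",
    }],
})

chosen = registry.discover("Find runbooks and incidents about cache invalidation")
```

The planner only sees card metadata here, so swapping an agent's implementation does not change the routing code.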

MCP-style tool servers

AgentMesh exposes tools behind a uniform interface:

tools = await mcp_client.list_tools(server="sql")
result = await mcp_client.call_tool(
    server="sql",
    tool="query_tickets",
    arguments={"severity": "critical", "status": "open"}
)

Included tool servers:

| MCP-style server | Tools |
| --- | --- |
| `sql` | `query_tickets`, `query_accounts`, `list_tables` |
| `vector` | `retrieve_documents`, `list_documents` |
| `rest` | `search_status`, `get_service_health`, `fetch_competitor_signal` |
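
The uniform interface means every server answers the same two calls. A minimal in-memory sketch of that idea follows; `ToolServer`, `MCPClient`, and the sample ticket rows are assumptions made for illustration, and the code does not use the official MCP SDK.

```python
import asyncio
from typing import Any, Callable


class ToolServer:
    """Holds named tools and exposes them through list_tools / call_tool."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Decorator that registers a function under its own name."""
        self._tools[fn.__name__] = fn
        return fn

    async def list_tools(self) -> list[str]:
        return sorted(self._tools)

    async def call_tool(self, tool: str, arguments: dict) -> Any:
        if tool not in self._tools:
            raise KeyError(f"unknown tool: {tool}")
        return self._tools[tool](**arguments)


class MCPClient:
    """Routes calls to a server by name, so agents never import tools directly."""

    def __init__(self, servers: dict[str, ToolServer]) -> None:
        self._servers = servers

    async def list_tools(self, server: str) -> list[str]:
        return await self._servers[server].list_tools()

    async def call_tool(self, server: str, tool: str, arguments: dict) -> Any:
        return await self._servers[server].call_tool(tool, arguments)


sql = ToolServer()

# Made-up rows standing in for the SQLite demo data.
TICKETS = [
    {"id": 1, "severity": "critical", "status": "open"},
    {"id": 2, "severity": "low", "status": "open"},
    {"id": 3, "severity": "critical", "status": "closed"},
]


@sql.tool
def query_tickets(severity: str, status: str) -> list[dict]:
    return [t for t in TICKETS if t["severity"] == severity and t["status"] == status]


mcp_client = MCPClient({"sql": sql})


async def main() -> list[dict]:
    return await mcp_client.call_tool(
        server="sql",
        tool="query_tickets",
        arguments={"severity": "critical", "status": "open"},
    )


rows = asyncio.run(main())
```

In this sketch, adding a tool means registering one more function on a server; the agent-side `call_tool` code stays the same.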

Evaluation

Run the benchmark:

agentmesh evaluate --output reports/eval_results.csv

The evaluator measures:

tool-selection accuracy
agent-routing accuracy
task completion
average step count
latency
unnecessary-tool rate

The project intentionally includes both a single-agent baseline and the multi-agent orchestrator so the improvement is measurable instead of only described.
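
As a rough illustration of how such metrics might be computed from per-task records, the sketch below uses hypothetical field names (`expected_tools`, `called_tools`, `completed`) and made-up records. It defines tool-selection accuracy as the share of tasks whose calls cover every expected tool, and the unnecessary-tool rate as the share of calls that were not expected. The project's evaluator may define both differently.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    expected_tools: set[str]
    called_tools: list[str]
    completed: bool


def score(records: list[TaskRecord]) -> dict[str, float]:
    """Aggregate simple metrics over a non-empty list of task records."""
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    total_calls = sum(len(r.called_tools) for r in records)
    # A task counts as correct when every expected tool was called.
    covered = sum(r.expected_tools <= set(r.called_tools) for r in records)
    unnecessary = sum(
        1 for r in records for t in r.called_tools if t not in r.expected_tools
    )
    return {
        "tool_selection_accuracy": covered / n,
        "task_completion": sum(r.completed for r in records) / n,
        "avg_steps": total_calls / n,
        "unnecessary_tool_rate": unnecessary / total_calls if total_calls else 0.0,
    }


records = [
    TaskRecord({"query_tickets", "retrieve_documents"},
               ["query_tickets", "retrieve_documents"], True),
    TaskRecord({"query_tickets"}, ["query_tickets", "search_status"], True),
    TaskRecord({"retrieve_documents"}, ["search_status"], False),
    TaskRecord({"search_status"}, ["search_status"], True),
]
metrics = score(records)
```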

AWS deployment design

The local architecture maps cleanly to AWS:

| Local component | AWS equivalent |
| --- | --- |
| FastAPI gateway | API Gateway + Lambda adapter |
| Agents | Lambda functions |
| Shared state | DynamoDB |
| Tool calls | MCP server Lambdas / internal services |
| LLM reasoning | Amazon Bedrock foundation models |
| Logs/traces | CloudWatch Logs |

The repository includes both AWS SAM and Terraform starter infrastructure. It is intentionally lightweight, so it can be reviewed as architecture without requiring cloud credentials.
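
To make the gateway and agent rows concrete, here is a hedged sketch of what a Lambda-style entry point behind an API Gateway proxy integration could look like. `run_orchestrator` is a placeholder for the real orchestration call, and the handler is not taken from the repository's SAM or Terraform files.

```python
import json


def run_orchestrator(request: str, mode: str) -> dict:
    """Placeholder for the real orchestrator; returns a canned response."""
    return {"mode": mode, "final_answer": f"(stub) handled: {request}"}


def handler(event: dict, context: object) -> dict:
    """Lambda entry point for an API Gateway proxy event with a JSON body."""
    try:
        body = json.loads(event.get("body") or "{}")
        request = body["request"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "body must be JSON with a 'request' field"}),
        }
    result = run_orchestrator(request, body.get("mode", "multi_agent"))
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }


# Local smoke test with plain dicts; no AWS credentials are involved.
ok = handler({"body": json.dumps({"request": "Find critical issues"})}, None)
bad = handler({"body": "not json"}, None)
```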

Design principles

Agent communication and tool execution are separate concerns.
Agent discovery should be metadata-driven, not hard-coded.
Tools should be added without rewriting agent logic.
Planning should be observable, not hidden in one opaque prompt.
Evaluation should compare architectures, not just model outputs.

Roadmap

Replace in-process A2A transport with HTTP-based agent-to-agent calls
Add signed Agent Cards and capability allowlists
Add true MCP SDK server implementations
Add Bedrock Converse API execution path
Add LangGraph backend adapter
Add distributed tracing with OpenTelemetry
Add React trace viewer
Add human approval step for sensitive tool calls

Responsible use

AgentMesh is a prototype for agent orchestration and evaluation. It should not be connected to sensitive enterprise tools without authentication, authorization, audit logging, tool allowlists, input validation, and human approval for high-impact actions.
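
Two of those controls, tool allowlists and human approval, can be sketched as a thin guard in front of tool execution. Everything here is hypothetical: the allowlist contents, the `HIGH_IMPACT` set, the `notifier-agent` and `send_customer_email` names, and the `approve` callback are assumptions, not features of the current prototype.

```python
from typing import Any, Callable

# Per-agent allowlist of (server, tool) pairs; contents are illustrative.
ALLOWLIST: dict[str, set[tuple[str, str]]] = {
    "sql-agent": {("sql", "query_tickets"), ("sql", "query_accounts")},
    "notifier-agent": {("rest", "send_customer_email")},
}
# Tools that need a human decision before they run.
HIGH_IMPACT: set[tuple[str, str]] = {("rest", "send_customer_email")}


def guarded_call(
    agent: str,
    server: str,
    tool: str,
    execute: Callable[[], Any],
    approve: Callable[[str], bool],
) -> Any:
    """Run execute() only if the agent is allowed and approval is granted."""
    key = (server, tool)
    if key not in ALLOWLIST.get(agent, set()):
        raise PermissionError(f"{agent} may not call {server}.{tool}")
    if key in HIGH_IMPACT and not approve(f"{agent} wants to call {server}.{tool}"):
        raise PermissionError(f"approval denied for {server}.{tool}")
    return execute()


# A low-impact, allowlisted call runs without consulting the approver.
allowed = guarded_call(
    "sql-agent", "sql", "query_tickets",
    execute=lambda: ["ticket-1"],
    approve=lambda message: False,
)
```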

License

MIT License.
