318 changes: 32 additions & 286 deletions notebooks/microsoft-iq-in-foundry.ipynb
@@ -391,12 +391,6 @@
"| **Auth model** | A **stored header** (`x-apikey`) carrying your Web IQ MCP key. |\n",
"| **Env vars** | `WEB_IQ_MCP_API_KEY` (**secret** — read from the environment, never committed). |\n",
"\n",
"> **Why stored headers and not a Foundry connection?** The MCP server supports\n",
"> both a `foundryConnection` auth variant and a `storedHeaders` variant. As of\n",
"> this preview, the `foundryConnection` form returns **HTTP 502 \"Could not\n",
"> resolve the tool manifest\"** on KB Retrieve (a known preview bug). The\n",
"> `storedHeaders` form (below) sends `x-apikey` directly and works reliably.\n",
"\n",
"[MCP Server knowledge source (Python)](https://learn.microsoft.com/en-us/azure/search/agentic-knowledge-source-overview)\n"
]
},
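The table above says the Web IQ key is read from the environment and sent as a stored `x-apikey` header. A minimal sketch of that step follows. The helper name `web_iq_headers` and its fail-fast behaviour are assumptions for illustration, not part of the notebook; only the `WEB_IQ_MCP_API_KEY` variable and the `x-apikey` header name come from the diff.

```python
import os


def web_iq_headers(env=os.environ):
    """Build the stored-header dict for Web IQ from the environment.

    Hypothetical helper: the notebook itself only shows
    headers={"x-apikey": web_iq_key}.
    """
    key = env.get("WEB_IQ_MCP_API_KEY", "").strip()
    if not key:
        # Fail loudly instead of registering a source with an empty key.
        raise RuntimeError("WEB_IQ_MCP_API_KEY is not set")
    return {"x-apikey": key}
```

Passing the mapping as a parameter keeps the secret out of the source and lets the function be exercised with a plain dict instead of the real environment.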
@@ -433,8 +427,7 @@
" description=\"Web IQ -- Microsoft Grounding MCP server (web, news, videos, browse).\",\n",
" mcp_server_parameters=McpServerKnowledgeSourceParameters(\n",
" server_url=WEB_IQ_MCP_SERVER_URL,\n",
" # storedHeaders auth (x-apikey). NOT foundryConnection -- that variant\n",
" # currently 502s (\"Could not resolve the tool manifest\") on KB Retrieve.\n",
" # storedHeaders auth: send the Web IQ x-apikey directly.\n",
" authentication=McpServerStoredHeadersAuthentication(\n",
" stored_headers_parameters=McpServerStoredHeadersParameters(\n",
" headers={\"x-apikey\": web_iq_key},\n",
@@ -518,6 +511,13 @@
" knowledge_sources=[KnowledgeSourceReference(name=n) for n in kb_sources],\n",
" retrieval_reasoning_effort=KnowledgeRetrievalLowReasoningEffort(),\n",
" output_mode=KnowledgeRetrievalOutputMode.ANSWER_SYNTHESIS,\n",
" retrieval_instructions=(\n",
" \"Route each subquery to the Microsoft IQ most likely to answer it: \"\n",
" \"use Work IQ for internal, people, and collaboration context; use \"\n",
" \"Fabric IQ for business facts from the ontology (fleet, routes, \"\n",
" \"operations); use Web IQ for current events and public information. \"\n",
" \"Use several sources when a question spans more than one.\"\n",
" ),\n",
" answer_instructions=(\n",
" \"Answer using only the retrieved content across all sources. \"\n",
" \"When a question spans work content, the airline ontology, and the web, \"\n",
@@ -536,24 +536,14 @@
"source": [
"## 7 · Query the knowledge layer\n",
"\n",
"This is the payoff. We run **four** retrievals against the one Foundry IQ KB:\n",
"\n",
"1. **7a Work IQ** — a work question, scoped to Work IQ.\n",
"2. **7b Fabric IQ** — an airline-ontology question, scoped to Fabric IQ.\n",
"3. **7c Web IQ** — a fresh-web question, scoped to Web IQ.\n",
"4. **7d Cross-source** — one question that **joins ≥2 IQs**, with all sources in\n",
" scope.\n",
"That's the whole build. Querying is a single call: send a question to the\n",
"Knowledge Base and Foundry IQ plans subqueries, routes them across Work IQ,\n",
"Fabric IQ, and Web IQ (steered by the `retrieval_instructions` from §6),\n",
"reranks, and returns one cited answer.\n",
"\n",
"Each cell prints the synthesized answer, the **reference count per source**, and\n",
"a sample extract — so you can *see* each Microsoft IQ grounding the answer.\n",
"\n",
"> **Per-user IQs need a caller token.** Both **Work IQ** and **Fabric IQ** are\n",
"> identity-scoped: each retrieve must carry a user token via\n",
"> `query_source_authorization` (audience `https://search.azure.com`). **Web IQ**\n",
"> authenticates with its own stored `x-apikey`, so it grounds with or without\n",
"> the token. The setup cell below mints the signed-in user's token with\n",
"> `DefaultAzureCredential`. See\n",
"> [Retrieve from a knowledge base (Python)](https://learn.microsoft.com/en-us/azure/search/agentic-retrieval-how-to-retrieve?pivots=python).\n"
"> Work IQ and Fabric IQ are per-user: pass the signed-in user's token via\n",
"> `query_source_authorization` (audience `https://search.azure.com`). Web IQ\n",
"> grounds with its own stored key.\n"
]
},
{
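The §7 note says Work IQ and Fabric IQ need the caller's token while Web IQ grounds with its own stored key. One way to mint that token on a best-effort basis is sketched below. It assumes only that the credential exposes `get_token(scope)` and returns an object with a `.token` attribute, as `azure.identity.DefaultAzureCredential` does; the function name `acquire_user_token` is hypothetical.

```python
from typing import Optional

# Scope for the audience named in the notebook (https://search.azure.com).
SEARCH_SCOPE = "https://search.azure.com/.default"


def acquire_user_token(credential, scope: str = SEARCH_SCOPE) -> Optional[str]:
    """Return the caller's access token, or None if it cannot be minted.

    `credential` is any object with get_token(scope) -> object with `.token`,
    for example azure.identity.DefaultAzureCredential().
    """
    try:
        return credential.get_token(scope).token
    except Exception as exc:  # best-effort: the per-user IQs need this token
        print(f"user token unavailable ({exc})")
        return None
```

The result would be passed as `query_source_authorization` on the retrieve call. Returning `None` instead of raising lets a run without a signed-in user proceed; in that case expect no references from the per-user sources.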
@@ -563,280 +553,36 @@
"metadata": {},
"outputs": [],
"source": [
"import json as _json\n",
"\n",
"from azure.identity import DefaultAzureCredential\n",
"from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient\n",
"from azure.search.documents.knowledgebases.models import (\n",
" FabricOntologyKnowledgeSourceParams,\n",
" KnowledgeBaseMessage,\n",
" KnowledgeBaseMessageTextContent,\n",
" KnowledgeBaseRetrievalRequest,\n",
" McpServerKnowledgeSourceParams,\n",
" WorkIQKnowledgeSourceParams,\n",
")\n",
"\n",
"retrieval_client = KnowledgeBaseRetrievalClient(\n",
"kb_client = KnowledgeBaseRetrievalClient(\n",
" endpoint=SEARCH_ENDPOINT,\n",
" credential=credential,\n",
" knowledge_base_name=KB_NAME,\n",
" credential=DefaultAzureCredential(),\n",
")\n",
"\n",
"\n",
"def user_query_authorization() -> Optional[str]:\n",
" \"\"\"Mint the signed-in user's token for the per-user IQs (Work IQ, Fabric IQ).\"\"\"\n",
" try:\n",
" token = DefaultAzureCredential().get_token(\"https://search.azure.com/.default\").token\n",
" print(\"query_source_authorization : acquired user token\")\n",
" return token\n",
" except Exception as exc: # noqa: BLE001 -- best-effort; Web IQ still works\n",
" print(f\"query_source_authorization : unavailable ({exc}); Work IQ + Fabric IQ will return no references\")\n",
" return None\n",
"\n",
"\n",
"query_auth = user_query_authorization()\n",
"\n",
"\n",
"def ks_params_for(name: str, reranker_threshold: Optional[float] = None):\n",
" \"\"\"Per-kind retrieve params for the federated IQ wired into the KB.\"\"\"\n",
" common = dict(\n",
" knowledge_source_name=name,\n",
" include_references=True,\n",
" include_reference_source_data=True,\n",
" )\n",
" if reranker_threshold is not None:\n",
" common[\"reranker_threshold\"] = reranker_threshold\n",
" if name == KS_WORK_IQ:\n",
" return WorkIQKnowledgeSourceParams(**common)\n",
" if name == KS_FABRIC_IQ:\n",
" return FabricOntologyKnowledgeSourceParams(**common)\n",
" if name == KS_WEB_IQ:\n",
" return McpServerKnowledgeSourceParams(**common)\n",
" return None\n",
"\n",
"\n",
"def retrieve(question: str, sources: list[str], *, reranker_threshold=None, max_runtime_seconds: int = 180):\n",
" \"\"\"Run one KB retrieval, scoped to `sources` (a subset of the KB's IQs).\"\"\"\n",
" params = [ks_params_for(n, reranker_threshold) for n in sources]\n",
" request = KnowledgeBaseRetrievalRequest(\n",
" messages=[\n",
" KnowledgeBaseMessage(\n",
" role=\"user\",\n",
" content=[KnowledgeBaseMessageTextContent(text=question)],\n",
" )\n",
" ],\n",
" knowledge_source_params=[p for p in params if p],\n",
" include_activity=True,\n",
" max_runtime_in_seconds=max_runtime_seconds,\n",
" )\n",
" # query_source_authorization auths the per-user IQs (Work IQ, Fabric IQ).\n",
" return retrieval_client.retrieve(request, query_source_authorization=query_auth)\n",
"\n",
"\n",
"def answer_text(result) -> str:\n",
" parts = []\n",
" for message in (result.response or []):\n",
" for content in (message.content or []):\n",
" text = getattr(content, \"text\", None)\n",
" if text:\n",
" parts.append(text)\n",
" return \"\\n\\n\".join(parts)\n",
"\n",
"\n",
"def describe_reference(ref) -> str:\n",
" \"\"\"Pull a human-readable snippet out of a reference, per IQ type.\"\"\"\n",
" rtype = getattr(ref, \"type\", None)\n",
" src = ref.source_data or {}\n",
" if not isinstance(src, dict):\n",
" return str(src)[:240]\n",
" if rtype == \"fabricOntology\": # Fabric IQ\n",
" bits = []\n",
" if src.get(\"fabricAnswer\"):\n",
" bits.append(str(src[\"fabricAnswer\"]))\n",
" if src.get(\"fabricRawData\"):\n",
" bits.append(\"data: \" + str(src[\"fabricRawData\"])[:160])\n",
" return \" | \".join(bits) or _json.dumps(src)[:240]\n",
" if rtype == \"workIQ\": # Work IQ\n",
" texts = [e.get(\"text\") for e in (src.get(\"extracts\") or []) if e.get(\"text\")]\n",
" more = [a.get(\"seeMoreWebUrl\") for a in (src.get(\"attributions\") or []) if a.get(\"seeMoreWebUrl\")]\n",
" out = (\" \".join(texts) or src.get(\"content\") or src.get(\"text\") or \"\")[:200]\n",
" if more:\n",
" out += f\" (see more: {more[0]})\"\n",
" return out or _json.dumps(src)[:240]\n",
" return (src.get(\"title\") or \"\") + \" \" + (src.get(\"content\") or src.get(\"text\") or _json.dumps(src))[:200]\n",
"\n",
"\n",
"def refs_by_type(result) -> dict:\n",
" counts: dict = {}\n",
" for r in (result.references or []):\n",
" t = getattr(r, \"type\", None)\n",
" counts[t] = counts.get(t, 0) + 1\n",
" return counts\n",
"\n",
"\n",
"def report(label: str, result, *, max_answer: int = 600, max_refs: int = 3) -> None:\n",
" if result is None:\n",
" print(f\"=== {label} ===\\n[skipped] source not created this run\\n\")\n",
" return\n",
" refs = result.references or []\n",
" print(f\"=== {label} ===\")\n",
" print(f\"references: {len(refs)} by_type: {refs_by_type(result)}\")\n",
" ans = answer_text(result).strip()\n",
" if ans:\n",
" print(f\"\\nANSWER\\n------\\n{ans[:max_answer]}{'...' if len(ans) > max_answer else ''}\")\n",
" for ref in refs[:max_refs]:\n",
" snippet = describe_reference(ref).strip().replace(\"\\n\", \" \")\n",
" print(f\" [ref_id:{ref.id}] type={getattr(ref, 'type', None)!r} :: {snippet[:200]}\")\n",
" print()\n"
]
},
{
"cell_type": "markdown",
"id": "fab74957",
"metadata": {},
"source": [
"### 7a · Work IQ — a work question\n",
"\n",
"Scope the retrieve to Work IQ alone and ask about internal work content. A\n",
"non-empty `references` list (type `workIQ`) proves Work IQ grounded the answer.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6d2442f5",
"metadata": {},
"outputs": [],
"source": [
"work_question = (\n",
" \"Summarize what we've discussed internally about transatlantic route \"\n",
" \"planning, long-haul fleet decisions, or premium-cabin strategy.\"\n",
")\n",
"\n",
"if KS_WORK_IQ in kb_sources:\n",
" res_work = retrieve(work_question, [KS_WORK_IQ])\n",
"else:\n",
" res_work = None\n",
" skip(\"7a Work IQ\", \"Work IQ source not created -- enable it in §3\")\n",
"\n",
"report(\"7a · Work IQ\", res_work)\n"
]
},
{
"cell_type": "markdown",
"id": "097018a8",
"metadata": {},
"source": [
"### 7b · Fabric IQ — an airline-ontology question\n",
"\n",
"Scope to Fabric IQ and ask a **narrow, aggregate** question the ontology can\n",
"answer in business terms. We pass `reranker_threshold=0.0` so on-topic ontology\n",
"rows aren't filtered out.\n",
"\n",
"> **Keep Fabric questions narrow.** A broad ontology question (\"list every\n",
"> aircraft *and* its routes\") can trip a **\"data is too large to process in a\n",
"> single request\"** response — Fabric tries to pull a whole entity table.\n",
"> Aggregate or scoped questions (\"how many aircraft *by manufacturer*?\") return\n",
"> clean `fabricAnswer` + `fabricRawData`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cbc2f1ae",
"metadata": {},
"outputs": [],
"source": [
"fabric_question = (\n",
" \"Using our airline ontology, how many aircraft are in the fleet, grouped by \"\n",
" \"manufacturer?\"\n",
")\n",
"\n",
"if KS_FABRIC_IQ in kb_sources:\n",
" # reranker_threshold=0.0 keeps on-ontology rows that a higher bar would drop.\n",
" # A narrow, aggregate question avoids the \"data too large\" Fabric error you\n",
" # get when a broad query tries to pull an entire entity table at once.\n",
" res_fabric = retrieve(fabric_question, [KS_FABRIC_IQ], reranker_threshold=0.0)\n",
"else:\n",
" res_fabric = None\n",
" skip(\"7b Fabric IQ\", \"Fabric IQ source not created -- enable it in §4\")\n",
"\n",
"report(\"7b · Fabric IQ\", res_fabric)\n"
]
},
{
"cell_type": "markdown",
"id": "385afa20",
"metadata": {},
"source": [
"### 7c · Web IQ — a fresh-web question\n",
"\n",
"Scope to Web IQ and ask something only the live web can answer. References of\n",
"type `web` / `mcpServer` prove Web IQ grounded the answer. (Web IQ needs no user\n",
"token — it uses its stored `x-apikey`.)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b432927",
"metadata": {},
"outputs": [],
"source": [
"web_question = (\n",
" \"What's the latest public news on long-haul, transatlantic aircraft and \"\n",
" \"route announcements from major carriers?\"\n",
")\n",
"\n",
"if KS_WEB_IQ in kb_sources:\n",
" res_web = retrieve(web_question, [KS_WEB_IQ])\n",
"else:\n",
" res_web = None\n",
" skip(\"7c Web IQ\", \"Web IQ source not created -- set WEB_IQ_MCP_API_KEY in §5 (waitlist: https://aka.ms/webiq-waitlist)\")\n",
"\n",
"report(\"7c · Web IQ\", res_web)\n"
]
},
{
"cell_type": "markdown",
"id": "3a47adcd",
"metadata": {},
"source": [
"### 7d · Cross-source — join the IQs\n",
"\n",
"Now the hero query. One question that **no single IQ can answer alone**: it pairs\n",
"our **fleet composition by manufacturer** (Fabric IQ) with the **latest public\n",
"news** on new long-haul aircraft and transatlantic routes (Web IQ), and folds in\n",
"any **internal context** (Work IQ). All sources are in scope; the planner fans\n",
"out, reranks, and synthesizes one cited answer. The activity trace shows how the\n",
"turn was decomposed.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fa73e00a",
"metadata": {},
"outputs": [],
"source": [
"cross_question = (\n",
" \"I'm prepping a transatlantic route-planning review. Using our airline \"\n",
" \"ontology, what is our fleet composition by manufacturer? Then combine that \"\n",
" \"with the latest public news on new long-haul aircraft and transatlantic \"\n",
" \"routes — which manufacturers in the news do we already operate? Add \"\n",
" \"anything we've discussed internally.\"\n",
"request = KnowledgeBaseRetrievalRequest(\n",
" messages=[\n",
" KnowledgeBaseMessage(\n",
" role=\"user\",\n",
" content=[KnowledgeBaseMessageTextContent(\n",
" text=\"What should I know before our transatlantic route-planning review?\"\n",
" )],\n",
" ),\n",
" ],\n",
")\n",
"\n",
"# reranker_threshold=0.0 lets each IQ's top row survive the shared rerank so the\n",
"# answer is grounded across sources, not dominated by one.\n",
"res_cross = retrieve(cross_question, kb_sources, reranker_threshold=0.0)\n",
"report(\"7d · Cross-source\", res_cross, max_answer=900, max_refs=6)\n",
"# Work IQ + Fabric IQ are per-user; pass the caller's token. Web IQ uses its own key.\n",
"user_token = DefaultAzureCredential().get_token(\"https://search.azure.com/.default\").token\n",
"result = kb_client.retrieve(request, query_source_authorization=user_token)\n",
"\n",
"# The planner activity trace: how the one turn was decomposed across the IQs.\n",
"activity = [a.as_dict() if hasattr(a, \"as_dict\") else dict(a) for a in (res_cross.activity or [])]\n",
"print(\"PLANNER ACTIVITY (truncated)\")\n",
"print(\"----------------------------\")\n",
"print(_json.dumps(activity, indent=2)[:2500])\n"
"print(result.response[0].content[0].text)\n"
]
},
{
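The simplified cell prints `result.response[0].content[0].text`, which assumes a non-empty response and no longer shows which IQ grounded the answer. The sketch below adapts the `answer_text` and `refs_by_type` helpers that this change removes, in case a per-source check is still wanted. It is duck-typed and run here against a stand-in object with made-up values, not a real retrieval result; the `workIQ` and `fabricOntology` type strings come from the removed code.

```python
from collections import Counter
from types import SimpleNamespace


def answer_text(result) -> str:
    """Join every text part of the response, tolerating empty fields."""
    parts = []
    for message in (getattr(result, "response", None) or []):
        for content in (getattr(message, "content", None) or []):
            text = getattr(content, "text", None)
            if text:
                parts.append(text)
    return "\n\n".join(parts)


def refs_by_type(result) -> dict:
    """Count references per source type (e.g. workIQ, fabricOntology)."""
    refs = getattr(result, "references", None) or []
    return dict(Counter(getattr(r, "type", None) for r in refs))


# Stand-in for a retrieval result; values are illustrative only.
fake = SimpleNamespace(
    response=[SimpleNamespace(content=[SimpleNamespace(text="Example answer.")])],
    references=[
        SimpleNamespace(type="workIQ"),
        SimpleNamespace(type="fabricOntology"),
        SimpleNamespace(type="workIQ"),
    ],
)
print(answer_text(fake))
print(refs_by_type(fake))
```

A type missing from the counts suggests that source returned no references for the question, for example because the user token was not passed.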
@@ -938,8 +684,8 @@
"## 10 · Next steps\n",
"\n",
"You built a **Microsoft IQ knowledge layer** — Work IQ, Fabric IQ, and Web IQ\n",
"federated by **Foundry IQ** into one Knowledge Base, proven to answer from each\n",
"IQ and across them. From here:\n",
"federated by **Foundry IQ** into one Knowledge Base you query with a single\n",
"call. From here:\n",
"\n",
"- **Tour every KS type.** The companion recipe\n",
" [Mastering Foundry IQ](mastering-foundry-iq) walks indexed, uploaded, and\n",