From 5f5dbf3c62292b7fa07abb18e0fa3452200d1818 Mon Sep 17 00:00:00 2001
From: Farzad <fsunavala@microsoft.com>
Date: Fri, 5 Jun 2026 16:31:13 -0500
Subject: [PATCH] Polish Microsoft IQ recipe: stored-header auth + minimal
 query

- Document only the stored-header (x-apikey) approach for Web IQ
- Set retrieval_instructions on the KB to steer source routing
- Reduce the query section to one minimal retrieve() call (KB auto-routes)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
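Note for reviewers: this patch removes the per-source query cells, including
the `refs_by_type` helper that tallied references by source. If you still want
to check which IQ grounded an answer after the single retrieve() call, one
option is to count `result.references` by their `type` attribute. The sketch
below is only an illustration under stated assumptions: it uses stand-in
objects instead of real SDK references, and the type strings are the ones the
removed cells mentioned (`workIQ`, `fabricOntology`, `mcpServer`).

```python
from collections import Counter
from types import SimpleNamespace


def refs_by_type(references):
    """Count references per `type` attribute; a missing type counts under None."""
    return Counter(getattr(ref, "type", None) for ref in (references or []))


# Stand-in objects, NOT real SDK types: they only mimic a `type` attribute.
sample_references = [
    SimpleNamespace(type="workIQ"),
    SimpleNamespace(type="fabricOntology"),
    SimpleNamespace(type="fabricOntology"),
    SimpleNamespace(type="mcpServer"),
]

counts = refs_by_type(sample_references)
print(dict(counts))
```

In the notebook this would presumably be called as
`refs_by_type(result.references)` after the retrieve() call; that usage is an
assumption and is not part of this patch.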
 notebooks/microsoft-iq-in-foundry.ipynb | 318 +++---------------------
 1 file changed, 32 insertions(+), 286 deletions(-)

diff --git a/notebooks/microsoft-iq-in-foundry.ipynb b/notebooks/microsoft-iq-in-foundry.ipynb
index 1128f69..11b0856 100644
--- a/notebooks/microsoft-iq-in-foundry.ipynb
+++ b/notebooks/microsoft-iq-in-foundry.ipynb
@@ -391,12 +391,6 @@
     "| **Auth model** | A **stored header** (`x-apikey`) carrying your Web IQ MCP key. |\n",
     "| **Env vars** | `WEB_IQ_MCP_API_KEY` (**secret** — read from the environment, never committed). |\n",
     "\n",
-    "> **Why stored headers and not a Foundry connection?** The MCP server supports\n",
-    "> both a `foundryConnection` auth variant and a `storedHeaders` variant. As of\n",
-    "> this preview, the `foundryConnection` form returns **HTTP 502 \"Could not\n",
-    "> resolve the tool manifest\"** on KB Retrieve (a known preview bug). The\n",
-    "> `storedHeaders` form (below) sends `x-apikey` directly and works reliably.\n",
-    "\n",
     "[MCP Server knowledge source (Python)](https://learn.microsoft.com/en-us/azure/search/agentic-knowledge-source-overview)\n"
    ]
   },
@@ -433,8 +427,7 @@
     "        description=\"Web IQ -- Microsoft Grounding MCP server (web, news, videos, browse).\",\n",
     "        mcp_server_parameters=McpServerKnowledgeSourceParameters(\n",
     "            server_url=WEB_IQ_MCP_SERVER_URL,\n",
-    "            # storedHeaders auth (x-apikey). NOT foundryConnection -- that variant\n",
-    "            # currently 502s (\"Could not resolve the tool manifest\") on KB Retrieve.\n",
+    "            # storedHeaders auth: send the Web IQ x-apikey directly.\n",
     "            authentication=McpServerStoredHeadersAuthentication(\n",
     "                stored_headers_parameters=McpServerStoredHeadersParameters(\n",
     "                    headers={\"x-apikey\": web_iq_key},\n",
@@ -518,6 +511,13 @@
     "    knowledge_sources=[KnowledgeSourceReference(name=n) for n in kb_sources],\n",
     "    retrieval_reasoning_effort=KnowledgeRetrievalLowReasoningEffort(),\n",
     "    output_mode=KnowledgeRetrievalOutputMode.ANSWER_SYNTHESIS,\n",
+    "    retrieval_instructions=(\n",
+    "        \"Route each subquery to the Microsoft IQ most likely to answer it: \"\n",
+    "        \"use Work IQ for internal, people, and collaboration context; use \"\n",
+    "        \"Fabric IQ for business facts from the ontology (fleet, routes, \"\n",
+    "        \"operations); use Web IQ for current events and public information. \"\n",
+    "        \"Use several sources when a question spans more than one.\"\n",
+    "    ),\n",
     "    answer_instructions=(\n",
     "        \"Answer using only the retrieved content across all sources. \"\n",
     "        \"When a question spans work content, the airline ontology, and the web, \"\n",
@@ -536,24 +536,14 @@
    "source": [
     "## 7 · Query the knowledge layer\n",
     "\n",
-    "This is the payoff. We run **four** retrievals against the one Foundry IQ KB:\n",
-    "\n",
-    "1. **7a Work IQ** — a work question, scoped to Work IQ.\n",
-    "2. **7b Fabric IQ** — an airline-ontology question, scoped to Fabric IQ.\n",
-    "3. **7c Web IQ** — a fresh-web question, scoped to Web IQ.\n",
-    "4. **7d Cross-source** — one question that **joins ≥2 IQs**, with all sources in\n",
-    "   scope.\n",
+    "That's the whole build. Querying is a single call: send a question to the\n",
+    "Knowledge Base, and Foundry IQ plans subqueries, routes them across Work IQ,\n",
+    "Fabric IQ, and Web IQ (steered by the `retrieval_instructions` from §6),\n",
+    "reranks the results, and returns one cited answer.\n",
     "\n",
-    "Each cell prints the synthesized answer, the **reference count per source**, and\n",
-    "a sample extract — so you can *see* each Microsoft IQ grounding the answer.\n",
-    "\n",
-    "> **Per-user IQs need a caller token.** Both **Work IQ** and **Fabric IQ** are\n",
-    "> identity-scoped: each retrieve must carry a user token via\n",
-    "> `query_source_authorization` (audience `https://search.azure.com`). **Web IQ**\n",
-    "> authenticates with its own stored `x-apikey`, so it grounds with or without\n",
-    "> the token. The setup cell below mints the signed-in user's token with\n",
-    "> `DefaultAzureCredential`. See\n",
-    "> [Retrieve from a knowledge base (Python)](https://learn.microsoft.com/en-us/azure/search/agentic-retrieval-how-to-retrieve?pivots=python).\n"
+    "> Work IQ and Fabric IQ are per-user: pass the signed-in user's token via\n",
+    "> `query_source_authorization` (audience `https://search.azure.com`). Web IQ\n",
+    "> authenticates with its own stored `x-apikey` and needs no user token.\n"
    ]
   },
   {
@@ -563,280 +553,36 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "import json as _json\n",
-    "\n",
     "from azure.identity import DefaultAzureCredential\n",
     "from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient\n",
     "from azure.search.documents.knowledgebases.models import (\n",
-    "    FabricOntologyKnowledgeSourceParams,\n",
     "    KnowledgeBaseMessage,\n",
     "    KnowledgeBaseMessageTextContent,\n",
     "    KnowledgeBaseRetrievalRequest,\n",
-    "    McpServerKnowledgeSourceParams,\n",
-    "    WorkIQKnowledgeSourceParams,\n",
     ")\n",
     "\n",
-    "retrieval_client = KnowledgeBaseRetrievalClient(\n",
+    "kb_client = KnowledgeBaseRetrievalClient(\n",
     "    endpoint=SEARCH_ENDPOINT,\n",
-    "    credential=credential,\n",
     "    knowledge_base_name=KB_NAME,\n",
+    "    credential=DefaultAzureCredential(),\n",
     ")\n",
     "\n",
-    "\n",
-    "def user_query_authorization() -> Optional[str]:\n",
-    "    \"\"\"Mint the signed-in user's token for the per-user IQs (Work IQ, Fabric IQ).\"\"\"\n",
-    "    try:\n",
-    "        token = DefaultAzureCredential().get_token(\"https://search.azure.com/.default\").token\n",
-    "        print(\"query_source_authorization : acquired user token\")\n",
-    "        return token\n",
-    "    except Exception as exc:  # noqa: BLE001 -- best-effort; Web IQ still works\n",
-    "        print(f\"query_source_authorization : unavailable ({exc}); Work IQ + Fabric IQ will return no references\")\n",
-    "        return None\n",
-    "\n",
-    "\n",
-    "query_auth = user_query_authorization()\n",
-    "\n",
-    "\n",
-    "def ks_params_for(name: str, reranker_threshold: Optional[float] = None):\n",
-    "    \"\"\"Per-kind retrieve params for the federated IQ wired into the KB.\"\"\"\n",
-    "    common = dict(\n",
-    "        knowledge_source_name=name,\n",
-    "        include_references=True,\n",
-    "        include_reference_source_data=True,\n",
-    "    )\n",
-    "    if reranker_threshold is not None:\n",
-    "        common[\"reranker_threshold\"] = reranker_threshold\n",
-    "    if name == KS_WORK_IQ:\n",
-    "        return WorkIQKnowledgeSourceParams(**common)\n",
-    "    if name == KS_FABRIC_IQ:\n",
-    "        return FabricOntologyKnowledgeSourceParams(**common)\n",
-    "    if name == KS_WEB_IQ:\n",
-    "        return McpServerKnowledgeSourceParams(**common)\n",
-    "    return None\n",
-    "\n",
-    "\n",
-    "def retrieve(question: str, sources: list[str], *, reranker_threshold=None, max_runtime_seconds: int = 180):\n",
-    "    \"\"\"Run one KB retrieval, scoped to `sources` (a subset of the KB's IQs).\"\"\"\n",
-    "    params = [ks_params_for(n, reranker_threshold) for n in sources]\n",
-    "    request = KnowledgeBaseRetrievalRequest(\n",
-    "        messages=[\n",
-    "            KnowledgeBaseMessage(\n",
-    "                role=\"user\",\n",
-    "                content=[KnowledgeBaseMessageTextContent(text=question)],\n",
-    "            )\n",
-    "        ],\n",
-    "        knowledge_source_params=[p for p in params if p],\n",
-    "        include_activity=True,\n",
-    "        max_runtime_in_seconds=max_runtime_seconds,\n",
-    "    )\n",
-    "    # query_source_authorization auths the per-user IQs (Work IQ, Fabric IQ).\n",
-    "    return retrieval_client.retrieve(request, query_source_authorization=query_auth)\n",
-    "\n",
-    "\n",
-    "def answer_text(result) -> str:\n",
-    "    parts = []\n",
-    "    for message in (result.response or []):\n",
-    "        for content in (message.content or []):\n",
-    "            text = getattr(content, \"text\", None)\n",
-    "            if text:\n",
-    "                parts.append(text)\n",
-    "    return \"\\n\\n\".join(parts)\n",
-    "\n",
-    "\n",
-    "def describe_reference(ref) -> str:\n",
-    "    \"\"\"Pull a human-readable snippet out of a reference, per IQ type.\"\"\"\n",
-    "    rtype = getattr(ref, \"type\", None)\n",
-    "    src = ref.source_data or {}\n",
-    "    if not isinstance(src, dict):\n",
-    "        return str(src)[:240]\n",
-    "    if rtype == \"fabricOntology\":                              # Fabric IQ\n",
-    "        bits = []\n",
-    "        if src.get(\"fabricAnswer\"):\n",
-    "            bits.append(str(src[\"fabricAnswer\"]))\n",
-    "        if src.get(\"fabricRawData\"):\n",
-    "            bits.append(\"data: \" + str(src[\"fabricRawData\"])[:160])\n",
-    "        return \"  |  \".join(bits) or _json.dumps(src)[:240]\n",
-    "    if rtype == \"workIQ\":                                      # Work IQ\n",
-    "        texts = [e.get(\"text\") for e in (src.get(\"extracts\") or []) if e.get(\"text\")]\n",
-    "        more = [a.get(\"seeMoreWebUrl\") for a in (src.get(\"attributions\") or []) if a.get(\"seeMoreWebUrl\")]\n",
-    "        out = (\" \".join(texts) or src.get(\"content\") or src.get(\"text\") or \"\")[:200]\n",
-    "        if more:\n",
-    "            out += f\"  (see more: {more[0]})\"\n",
-    "        return out or _json.dumps(src)[:240]\n",
-    "    return (src.get(\"title\") or \"\") + \" \" + (src.get(\"content\") or src.get(\"text\") or _json.dumps(src))[:200]\n",
-    "\n",
-    "\n",
-    "def refs_by_type(result) -> dict:\n",
-    "    counts: dict = {}\n",
-    "    for r in (result.references or []):\n",
-    "        t = getattr(r, \"type\", None)\n",
-    "        counts[t] = counts.get(t, 0) + 1\n",
-    "    return counts\n",
-    "\n",
-    "\n",
-    "def report(label: str, result, *, max_answer: int = 600, max_refs: int = 3) -> None:\n",
-    "    if result is None:\n",
-    "        print(f\"=== {label} ===\\n[skipped] source not created this run\\n\")\n",
-    "        return\n",
-    "    refs = result.references or []\n",
-    "    print(f\"=== {label} ===\")\n",
-    "    print(f\"references: {len(refs)}   by_type: {refs_by_type(result)}\")\n",
-    "    ans = answer_text(result).strip()\n",
-    "    if ans:\n",
-    "        print(f\"\\nANSWER\\n------\\n{ans[:max_answer]}{'...' if len(ans) > max_answer else ''}\")\n",
-    "    for ref in refs[:max_refs]:\n",
-    "        snippet = describe_reference(ref).strip().replace(\"\\n\", \" \")\n",
-    "        print(f\"  [ref_id:{ref.id}] type={getattr(ref, 'type', None)!r} :: {snippet[:200]}\")\n",
-    "    print()\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fab74957",
-   "metadata": {},
-   "source": [
-    "### 7a · Work IQ — a work question\n",
-    "\n",
-    "Scope the retrieve to Work IQ alone and ask about internal work content. A\n",
-    "non-empty `references` list (type `workIQ`) proves Work IQ grounded the answer.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "6d2442f5",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "work_question = (\n",
-    "    \"Summarize what we've discussed internally about transatlantic route \"\n",
-    "    \"planning, long-haul fleet decisions, or premium-cabin strategy.\"\n",
-    ")\n",
-    "\n",
-    "if KS_WORK_IQ in kb_sources:\n",
-    "    res_work = retrieve(work_question, [KS_WORK_IQ])\n",
-    "else:\n",
-    "    res_work = None\n",
-    "    skip(\"7a Work IQ\", \"Work IQ source not created -- enable it in §3\")\n",
-    "\n",
-    "report(\"7a · Work IQ\", res_work)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "097018a8",
-   "metadata": {},
-   "source": [
-    "### 7b · Fabric IQ — an airline-ontology question\n",
-    "\n",
-    "Scope to Fabric IQ and ask a **narrow, aggregate** question the ontology can\n",
-    "answer in business terms. We pass `reranker_threshold=0.0` so on-topic ontology\n",
-    "rows aren't filtered out.\n",
-    "\n",
-    "> **Keep Fabric questions narrow.** A broad ontology question (\"list every\n",
-    "> aircraft *and* its routes\") can trip a **\"data is too large to process in a\n",
-    "> single request\"** response — Fabric tries to pull a whole entity table.\n",
-    "> Aggregate or scoped questions (\"how many aircraft *by manufacturer*?\") return\n",
-    "> clean `fabricAnswer` + `fabricRawData`.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "cbc2f1ae",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "fabric_question = (\n",
-    "    \"Using our airline ontology, how many aircraft are in the fleet, grouped by \"\n",
-    "    \"manufacturer?\"\n",
-    ")\n",
-    "\n",
-    "if KS_FABRIC_IQ in kb_sources:\n",
-    "    # reranker_threshold=0.0 keeps on-ontology rows that a higher bar would drop.\n",
-    "    # A narrow, aggregate question avoids the \"data too large\" Fabric error you\n",
-    "    # get when a broad query tries to pull an entire entity table at once.\n",
-    "    res_fabric = retrieve(fabric_question, [KS_FABRIC_IQ], reranker_threshold=0.0)\n",
-    "else:\n",
-    "    res_fabric = None\n",
-    "    skip(\"7b Fabric IQ\", \"Fabric IQ source not created -- enable it in §4\")\n",
-    "\n",
-    "report(\"7b · Fabric IQ\", res_fabric)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "385afa20",
-   "metadata": {},
-   "source": [
-    "### 7c · Web IQ — a fresh-web question\n",
-    "\n",
-    "Scope to Web IQ and ask something only the live web can answer. References of\n",
-    "type `web` / `mcpServer` prove Web IQ grounded the answer. (Web IQ needs no user\n",
-    "token — it uses its stored `x-apikey`.)\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "7b432927",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "web_question = (\n",
-    "    \"What's the latest public news on long-haul, transatlantic aircraft and \"\n",
-    "    \"route announcements from major carriers?\"\n",
-    ")\n",
-    "\n",
-    "if KS_WEB_IQ in kb_sources:\n",
-    "    res_web = retrieve(web_question, [KS_WEB_IQ])\n",
-    "else:\n",
-    "    res_web = None\n",
-    "    skip(\"7c Web IQ\", \"Web IQ source not created -- set WEB_IQ_MCP_API_KEY in §5 (waitlist: https://aka.ms/webiq-waitlist)\")\n",
-    "\n",
-    "report(\"7c · Web IQ\", res_web)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3a47adcd",
-   "metadata": {},
-   "source": [
-    "### 7d · Cross-source — join the IQs\n",
-    "\n",
-    "Now the hero query. One question that **no single IQ can answer alone**: it pairs\n",
-    "our **fleet composition by manufacturer** (Fabric IQ) with the **latest public\n",
-    "news** on new long-haul aircraft and transatlantic routes (Web IQ), and folds in\n",
-    "any **internal context** (Work IQ). All sources are in scope; the planner fans\n",
-    "out, reranks, and synthesizes one cited answer. The activity trace shows how the\n",
-    "turn was decomposed.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "fa73e00a",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "cross_question = (\n",
-    "    \"I'm prepping a transatlantic route-planning review. Using our airline \"\n",
-    "    \"ontology, what is our fleet composition by manufacturer? Then combine that \"\n",
-    "    \"with the latest public news on new long-haul aircraft and transatlantic \"\n",
-    "    \"routes — which manufacturers in the news do we already operate? Add \"\n",
-    "    \"anything we've discussed internally.\"\n",
+    "request = KnowledgeBaseRetrievalRequest(\n",
+    "    messages=[\n",
+    "        KnowledgeBaseMessage(\n",
+    "            role=\"user\",\n",
+    "            content=[KnowledgeBaseMessageTextContent(\n",
+    "                text=\"What should I know before our transatlantic route-planning review?\"\n",
+    "            )],\n",
+    "        ),\n",
+    "    ],\n",
     ")\n",
     "\n",
-    "# reranker_threshold=0.0 lets each IQ's top row survive the shared rerank so the\n",
-    "# answer is grounded across sources, not dominated by one.\n",
-    "res_cross = retrieve(cross_question, kb_sources, reranker_threshold=0.0)\n",
-    "report(\"7d · Cross-source\", res_cross, max_answer=900, max_refs=6)\n",
+    "# Work IQ + Fabric IQ are per-user; pass the caller's token. Web IQ uses its own key.\n",
+    "user_token = DefaultAzureCredential().get_token(\"https://search.azure.com/.default\").token\n",
+    "result = kb_client.retrieve(request, query_source_authorization=user_token)\n",
     "\n",
-    "# The planner activity trace: how the one turn was decomposed across the IQs.\n",
-    "activity = [a.as_dict() if hasattr(a, \"as_dict\") else dict(a) for a in (res_cross.activity or [])]\n",
-    "print(\"PLANNER ACTIVITY (truncated)\")\n",
-    "print(\"----------------------------\")\n",
-    "print(_json.dumps(activity, indent=2)[:2500])\n"
+    "print(result.response[0].content[0].text)\n"
    ]
   },
   {
@@ -938,8 +684,8 @@
     "## 10 · Next steps\n",
     "\n",
     "You built a **Microsoft IQ knowledge layer** — Work IQ, Fabric IQ, and Web IQ\n",
-    "federated by **Foundry IQ** into one Knowledge Base, proven to answer from each\n",
-    "IQ and across them. From here:\n",
+    "federated by **Foundry IQ** into one Knowledge Base you query with a single\n",
+    "call. From here:\n",
     "\n",
     "- **Tour every KS type.** The companion recipe\n",
     "  [Mastering Foundry IQ](mastering-foundry-iq) walks indexed, uploaded, and\n",