From 5f5dbf3c62292b7fa07abb18e0fa3452200d1818 Mon Sep 17 00:00:00 2001
From: Farzad
Date: Fri, 5 Jun 2026 16:31:13 -0500
Subject: [PATCH] Polish Microsoft IQ recipe: stored-header auth + minimal query

- Document only the stored-header (x-apikey) approach for Web IQ
- Set retrieval_instructions on the KB to steer source routing
- Reduce the query section to one minimal retrieve() call (KB auto-routes)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 notebooks/microsoft-iq-in-foundry.ipynb | 318 +++---------------------
 1 file changed, 32 insertions(+), 286 deletions(-)

diff --git a/notebooks/microsoft-iq-in-foundry.ipynb b/notebooks/microsoft-iq-in-foundry.ipynb
index 1128f69..11b0856 100644
--- a/notebooks/microsoft-iq-in-foundry.ipynb
+++ b/notebooks/microsoft-iq-in-foundry.ipynb
@@ -391,12 +391,6 @@
     "| **Auth model** | A **stored header** (`x-apikey`) carrying your Web IQ MCP key. |\n",
     "| **Env vars** | `WEB_IQ_MCP_API_KEY` (**secret** — read from the environment, never committed). |\n",
     "\n",
-    "> **Why stored headers and not a Foundry connection?** The MCP server supports\n",
-    "> both a `foundryConnection` auth variant and a `storedHeaders` variant. As of\n",
-    "> this preview, the `foundryConnection` form returns **HTTP 502 \"Could not\n",
-    "> resolve the tool manifest\"** on KB Retrieve (a known preview bug). The\n",
-    "> `storedHeaders` form (below) sends `x-apikey` directly and works reliably.\n",
-    "\n",
     "[MCP Server knowledge source (Python)](https://learn.microsoft.com/en-us/azure/search/agentic-knowledge-source-overview)\n"
    ]
   },
@@ -433,8 +427,7 @@
     "    description=\"Web IQ -- Microsoft Grounding MCP server (web, news, videos, browse).\",\n",
     "    mcp_server_parameters=McpServerKnowledgeSourceParameters(\n",
     "        server_url=WEB_IQ_MCP_SERVER_URL,\n",
-    "        # storedHeaders auth (x-apikey). NOT foundryConnection -- that variant\n",
-    "        # currently 502s (\"Could not resolve the tool manifest\") on KB Retrieve.\n",
+    "        # storedHeaders auth: send the Web IQ x-apikey directly.\n",
     "        authentication=McpServerStoredHeadersAuthentication(\n",
     "            stored_headers_parameters=McpServerStoredHeadersParameters(\n",
     "                headers={\"x-apikey\": web_iq_key},\n",
@@ -518,6 +511,13 @@
     "    knowledge_sources=[KnowledgeSourceReference(name=n) for n in kb_sources],\n",
     "    retrieval_reasoning_effort=KnowledgeRetrievalLowReasoningEffort(),\n",
     "    output_mode=KnowledgeRetrievalOutputMode.ANSWER_SYNTHESIS,\n",
+    "    retrieval_instructions=(\n",
+    "        \"Route each subquery to the Microsoft IQ most likely to answer it: \"\n",
+    "        \"use Work IQ for internal, people, and collaboration context; use \"\n",
+    "        \"Fabric IQ for business facts from the ontology (fleet, routes, \"\n",
+    "        \"operations); use Web IQ for current events and public information. \"\n",
+    "        \"Use several sources when a question spans more than one.\"\n",
+    "    ),\n",
     "    answer_instructions=(\n",
     "        \"Answer using only the retrieved content across all sources. \"\n",
     "        \"When a question spans work content, the airline ontology, and the web, \"\n",
@@ -536,24 +536,14 @@
    "source": [
     "## 7 · Query the knowledge layer\n",
     "\n",
-    "This is the payoff. We run **four** retrievals against the one Foundry IQ KB:\n",
-    "\n",
-    "1. **7a Work IQ** — a work question, scoped to Work IQ.\n",
-    "2. **7b Fabric IQ** — an airline-ontology question, scoped to Fabric IQ.\n",
-    "3. **7c Web IQ** — a fresh-web question, scoped to Web IQ.\n",
-    "4. **7d Cross-source** — one question that **joins ≥2 IQs**, with all sources in\n",
-    "   scope.\n",
+    "That's the whole build. Querying is a single call: send a question to the\n",
+    "Knowledge Base and Foundry IQ plans subqueries, routes them across Work IQ,\n",
+    "Fabric IQ, and Web IQ (steered by the `retrieval_instructions` from §6),\n",
+    "reranks, and returns one cited answer.\n",
     "\n",
-    "Each cell prints the synthesized answer, the **reference count per source**, and\n",
-    "a sample extract — so you can *see* each Microsoft IQ grounding the answer.\n",
-    "\n",
-    "> **Per-user IQs need a caller token.** Both **Work IQ** and **Fabric IQ** are\n",
-    "> identity-scoped: each retrieve must carry a user token via\n",
-    "> `query_source_authorization` (audience `https://search.azure.com`). **Web IQ**\n",
-    "> authenticates with its own stored `x-apikey`, so it grounds with or without\n",
-    "> the token. The setup cell below mints the signed-in user's token with\n",
-    "> `DefaultAzureCredential`. See\n",
-    "> [Retrieve from a knowledge base (Python)](https://learn.microsoft.com/en-us/azure/search/agentic-retrieval-how-to-retrieve?pivots=python).\n"
+    "> Work IQ and Fabric IQ are per-user: pass the signed-in user's token via\n",
+    "> `query_source_authorization` (audience `https://search.azure.com`). Web IQ\n",
+    "> grounds with its own stored key.\n"
    ]
   },
   {
@@ -563,280 +553,36 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "import json as _json\n",
-    "\n",
     "from azure.identity import DefaultAzureCredential\n",
     "from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient\n",
     "from azure.search.documents.knowledgebases.models import (\n",
-    "    FabricOntologyKnowledgeSourceParams,\n",
     "    KnowledgeBaseMessage,\n",
     "    KnowledgeBaseMessageTextContent,\n",
     "    KnowledgeBaseRetrievalRequest,\n",
-    "    McpServerKnowledgeSourceParams,\n",
-    "    WorkIQKnowledgeSourceParams,\n",
     ")\n",
     "\n",
-    "retrieval_client = KnowledgeBaseRetrievalClient(\n",
+    "kb_client = KnowledgeBaseRetrievalClient(\n",
     "    endpoint=SEARCH_ENDPOINT,\n",
-    "    credential=credential,\n",
     "    knowledge_base_name=KB_NAME,\n",
+    "    credential=DefaultAzureCredential(),\n",
     ")\n",
     "\n",
-    "\n",
-    "def user_query_authorization() -> Optional[str]:\n",
-    "    \"\"\"Mint the signed-in user's token for the per-user IQs (Work IQ, Fabric IQ).\"\"\"\n",
-    "    try:\n",
-    "        token = DefaultAzureCredential().get_token(\"https://search.azure.com/.default\").token\n",
-    "        print(\"query_source_authorization : acquired user token\")\n",
-    "        return token\n",
-    "    except Exception as exc: # noqa: BLE001 -- best-effort; Web IQ still works\n",
-    "        print(f\"query_source_authorization : unavailable ({exc}); Work IQ + Fabric IQ will return no references\")\n",
-    "        return None\n",
-    "\n",
-    "\n",
-    "query_auth = user_query_authorization()\n",
-    "\n",
-    "\n",
-    "def ks_params_for(name: str, reranker_threshold: Optional[float] = None):\n",
-    "    \"\"\"Per-kind retrieve params for the federated IQ wired into the KB.\"\"\"\n",
-    "    common = dict(\n",
-    "        knowledge_source_name=name,\n",
-    "        include_references=True,\n",
-    "        include_reference_source_data=True,\n",
-    "    )\n",
-    "    if reranker_threshold is not None:\n",
-    "        common[\"reranker_threshold\"] = reranker_threshold\n",
-    "    if name == KS_WORK_IQ:\n",
-    "        return WorkIQKnowledgeSourceParams(**common)\n",
-    "    if name == KS_FABRIC_IQ:\n",
-    "        return FabricOntologyKnowledgeSourceParams(**common)\n",
-    "    if name == KS_WEB_IQ:\n",
-    "        return McpServerKnowledgeSourceParams(**common)\n",
-    "    return None\n",
-    "\n",
-    "\n",
-    "def retrieve(question: str, sources: list[str], *, reranker_threshold=None, max_runtime_seconds: int = 180):\n",
-    "    \"\"\"Run one KB retrieval, scoped to `sources` (a subset of the KB's IQs).\"\"\"\n",
-    "    params = [ks_params_for(n, reranker_threshold) for n in sources]\n",
-    "    request = KnowledgeBaseRetrievalRequest(\n",
-    "        messages=[\n",
-    "            KnowledgeBaseMessage(\n",
-    "                role=\"user\",\n",
-    "                content=[KnowledgeBaseMessageTextContent(text=question)],\n",
-    "            )\n",
-    "        ],\n",
-    "        knowledge_source_params=[p for p in params if p],\n",
-    "        include_activity=True,\n",
-    "        max_runtime_in_seconds=max_runtime_seconds,\n",
-    "    )\n",
-    "    # query_source_authorization auths the per-user IQs (Work IQ, Fabric IQ).\n",
-    "    return retrieval_client.retrieve(request, query_source_authorization=query_auth)\n",
-    "\n",
-    "\n",
-    "def answer_text(result) -> str:\n",
-    "    parts = []\n",
-    "    for message in (result.response or []):\n",
-    "        for content in (message.content or []):\n",
-    "            text = getattr(content, \"text\", None)\n",
-    "            if text:\n",
-    "                parts.append(text)\n",
-    "    return \"\\n\\n\".join(parts)\n",
-    "\n",
-    "\n",
-    "def describe_reference(ref) -> str:\n",
-    "    \"\"\"Pull a human-readable snippet out of a reference, per IQ type.\"\"\"\n",
-    "    rtype = getattr(ref, \"type\", None)\n",
-    "    src = ref.source_data or {}\n",
-    "    if not isinstance(src, dict):\n",
-    "        return str(src)[:240]\n",
-    "    if rtype == \"fabricOntology\": # Fabric IQ\n",
-    "        bits = []\n",
-    "        if src.get(\"fabricAnswer\"):\n",
-    "            bits.append(str(src[\"fabricAnswer\"]))\n",
-    "        if src.get(\"fabricRawData\"):\n",
-    "            bits.append(\"data: \" + str(src[\"fabricRawData\"])[:160])\n",
-    "        return \" | \".join(bits) or _json.dumps(src)[:240]\n",
-    "    if rtype == \"workIQ\": # Work IQ\n",
-    "        texts = [e.get(\"text\") for e in (src.get(\"extracts\") or []) if e.get(\"text\")]\n",
-    "        more = [a.get(\"seeMoreWebUrl\") for a in (src.get(\"attributions\") or []) if a.get(\"seeMoreWebUrl\")]\n",
-    "        out = (\" \".join(texts) or src.get(\"content\") or src.get(\"text\") or \"\")[:200]\n",
-    "        if more:\n",
-    "            out += f\" (see more: {more[0]})\"\n",
-    "        return out or _json.dumps(src)[:240]\n",
-    "    return (src.get(\"title\") or \"\") + \" \" + (src.get(\"content\") or src.get(\"text\") or _json.dumps(src))[:200]\n",
-    "\n",
-    "\n",
-    "def refs_by_type(result) -> dict:\n",
-    "    counts: dict = {}\n",
-    "    for r in (result.references or []):\n",
-    "        t = getattr(r, \"type\", None)\n",
-    "        counts[t] = counts.get(t, 0) + 1\n",
-    "    return counts\n",
-    "\n",
-    "\n",
-    "def report(label: str, result, *, max_answer: int = 600, max_refs: int = 3) -> None:\n",
-    "    if result is None:\n",
-    "        print(f\"=== {label} ===\\n[skipped] source not created this run\\n\")\n",
-    "        return\n",
-    "    refs = result.references or []\n",
-    "    print(f\"=== {label} ===\")\n",
-    "    print(f\"references: {len(refs)} by_type: {refs_by_type(result)}\")\n",
-    "    ans = answer_text(result).strip()\n",
-    "    if ans:\n",
-    "        print(f\"\\nANSWER\\n------\\n{ans[:max_answer]}{'...' if len(ans) > max_answer else ''}\")\n",
-    "    for ref in refs[:max_refs]:\n",
-    "        snippet = describe_reference(ref).strip().replace(\"\\n\", \" \")\n",
-    "        print(f\" [ref_id:{ref.id}] type={getattr(ref, 'type', None)!r} :: {snippet[:200]}\")\n",
-    "    print()\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fab74957",
-   "metadata": {},
-   "source": [
-    "### 7a · Work IQ — a work question\n",
-    "\n",
-    "Scope the retrieve to Work IQ alone and ask about internal work content. A\n",
-    "non-empty `references` list (type `workIQ`) proves Work IQ grounded the answer.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "6d2442f5",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "work_question = (\n",
-    "    \"Summarize what we've discussed internally about transatlantic route \"\n",
-    "    \"planning, long-haul fleet decisions, or premium-cabin strategy.\"\n",
-    ")\n",
-    "\n",
-    "if KS_WORK_IQ in kb_sources:\n",
-    "    res_work = retrieve(work_question, [KS_WORK_IQ])\n",
-    "else:\n",
-    "    res_work = None\n",
-    "    skip(\"7a Work IQ\", \"Work IQ source not created -- enable it in §3\")\n",
-    "\n",
-    "report(\"7a · Work IQ\", res_work)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "097018a8",
-   "metadata": {},
-   "source": [
-    "### 7b · Fabric IQ — an airline-ontology question\n",
-    "\n",
-    "Scope to Fabric IQ and ask a **narrow, aggregate** question the ontology can\n",
-    "answer in business terms. We pass `reranker_threshold=0.0` so on-topic ontology\n",
-    "rows aren't filtered out.\n",
-    "\n",
-    "> **Keep Fabric questions narrow.** A broad ontology question (\"list every\n",
-    "> aircraft *and* its routes\") can trip a **\"data is too large to process in a\n",
-    "> single request\"** response — Fabric tries to pull a whole entity table.\n",
-    "> Aggregate or scoped questions (\"how many aircraft *by manufacturer*?\") return\n",
-    "> clean `fabricAnswer` + `fabricRawData`.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "cbc2f1ae",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "fabric_question = (\n",
-    "    \"Using our airline ontology, how many aircraft are in the fleet, grouped by \"\n",
-    "    \"manufacturer?\"\n",
-    ")\n",
-    "\n",
-    "if KS_FABRIC_IQ in kb_sources:\n",
-    "    # reranker_threshold=0.0 keeps on-ontology rows that a higher bar would drop.\n",
-    "    # A narrow, aggregate question avoids the \"data too large\" Fabric error you\n",
-    "    # get when a broad query tries to pull an entire entity table at once.\n",
-    "    res_fabric = retrieve(fabric_question, [KS_FABRIC_IQ], reranker_threshold=0.0)\n",
-    "else:\n",
-    "    res_fabric = None\n",
-    "    skip(\"7b Fabric IQ\", \"Fabric IQ source not created -- enable it in §4\")\n",
-    "\n",
-    "report(\"7b · Fabric IQ\", res_fabric)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "385afa20",
-   "metadata": {},
-   "source": [
-    "### 7c · Web IQ — a fresh-web question\n",
-    "\n",
-    "Scope to Web IQ and ask something only the live web can answer. References of\n",
-    "type `web` / `mcpServer` prove Web IQ grounded the answer. (Web IQ needs no user\n",
-    "token — it uses its stored `x-apikey`.)\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "7b432927",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "web_question = (\n",
-    "    \"What's the latest public news on long-haul, transatlantic aircraft and \"\n",
-    "    \"route announcements from major carriers?\"\n",
-    ")\n",
-    "\n",
-    "if KS_WEB_IQ in kb_sources:\n",
-    "    res_web = retrieve(web_question, [KS_WEB_IQ])\n",
-    "else:\n",
-    "    res_web = None\n",
-    "    skip(\"7c Web IQ\", \"Web IQ source not created -- set WEB_IQ_MCP_API_KEY in §5 (waitlist: https://aka.ms/webiq-waitlist)\")\n",
-    "\n",
-    "report(\"7c · Web IQ\", res_web)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3a47adcd",
-   "metadata": {},
-   "source": [
-    "### 7d · Cross-source — join the IQs\n",
-    "\n",
-    "Now the hero query. One question that **no single IQ can answer alone**: it pairs\n",
-    "our **fleet composition by manufacturer** (Fabric IQ) with the **latest public\n",
-    "news** on new long-haul aircraft and transatlantic routes (Web IQ), and folds in\n",
-    "any **internal context** (Work IQ). All sources are in scope; the planner fans\n",
-    "out, reranks, and synthesizes one cited answer. The activity trace shows how the\n",
-    "turn was decomposed.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "fa73e00a",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "cross_question = (\n",
-    "    \"I'm prepping a transatlantic route-planning review. Using our airline \"\n",
-    "    \"ontology, what is our fleet composition by manufacturer? Then combine that \"\n",
-    "    \"with the latest public news on new long-haul aircraft and transatlantic \"\n",
-    "    \"routes — which manufacturers in the news do we already operate? Add \"\n",
-    "    \"anything we've discussed internally.\"\n",
+    "request = KnowledgeBaseRetrievalRequest(\n",
+    "    messages=[\n",
+    "        KnowledgeBaseMessage(\n",
+    "            role=\"user\",\n",
+    "            content=[KnowledgeBaseMessageTextContent(\n",
+    "                text=\"What should I know before our transatlantic route-planning review?\"\n",
+    "            )],\n",
+    "        ),\n",
+    "    ],\n",
     ")\n",
     "\n",
-    "# reranker_threshold=0.0 lets each IQ's top row survive the shared rerank so the\n",
-    "# answer is grounded across sources, not dominated by one.\n",
-    "res_cross = retrieve(cross_question, kb_sources, reranker_threshold=0.0)\n",
-    "report(\"7d · Cross-source\", res_cross, max_answer=900, max_refs=6)\n",
+    "# Work IQ + Fabric IQ are per-user; pass the caller's token. Web IQ uses its own key.\n",
+    "user_token = DefaultAzureCredential().get_token(\"https://search.azure.com/.default\").token\n",
+    "result = kb_client.retrieve(request, query_source_authorization=user_token)\n",
     "\n",
-    "# The planner activity trace: how the one turn was decomposed across the IQs.\n",
-    "activity = [a.as_dict() if hasattr(a, \"as_dict\") else dict(a) for a in (res_cross.activity or [])]\n",
-    "print(\"PLANNER ACTIVITY (truncated)\")\n",
-    "print(\"----------------------------\")\n",
-    "print(_json.dumps(activity, indent=2)[:2500])\n"
+    "print(result.response[0].content[0].text)\n"
    ]
   },
   {
@@ -938,8 +684,8 @@
     "## 10 · Next steps\n",
     "\n",
     "You built a **Microsoft IQ knowledge layer** — Work IQ, Fabric IQ, and Web IQ\n",
-    "federated by **Foundry IQ** into one Knowledge Base, proven to answer from each\n",
-    "IQ and across them. From here:\n",
+    "federated by **Foundry IQ** into one Knowledge Base you query with a single\n",
+    "call. From here:\n",
     "\n",
     "- **Tour every KS type.** The companion recipe\n",
     "  [Mastering Foundry IQ](mastering-foundry-iq) walks indexed, uploaded, and\n",