Improve all samples with cache-awareness, add 4 new samples, fix SDK versions, and prepare repo for public sharing #546
@leestott is attempting to deploy a commit to the MSFT-AIP Team on Vercel. A member of the Team first needs to authorize it.
Pull request overview
This PR updates the repository’s samples to be more “cache-aware” (skip redundant model downloads and provide clearer progress UX), adds several new end-to-end samples (JS local CAG/RAG, Python agent framework, C# Whisper transcription), and tightens repo hygiene/version consistency in preparation for public sharing.
Changes:
- Added new JS offline CAG and offline RAG samples with web UIs + model init progress reporting.
- Added a new Python “agent-framework” sample (multi-agent orchestration + Flask SSE UI) and smoke tests.
- Updated multiple existing samples/notebooks/docs to use cache checks, clearer lifecycle steps, and pinned SDK versions (plus SUPPORT.md refresh).
Reviewed changes
Copilot reviewed 93 out of 93 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| samples/rag/rag_foundrylocal_demo.ipynb | Updates notebook to use Foundry Local C# SDK lifecycle + SDK-managed endpoint. |
| samples/rag/README.md | Documents SDK-based lifecycle and removes hardcoded endpoint/variant guidance. |
| samples/python/summarize/summarize.py | Adds cache-aware model selection/download UX for summarize CLI. |
| samples/python/summarize/requirements.txt | Bumps minimum foundry-local-sdk version. |
| samples/python/summarize/README.md | Adds feature notes for cache-awareness + UX improvements. |
| samples/python/hello-foundry-local/src/app.py | Adds cache-check + explicit lifecycle steps before streaming chat. |
| samples/python/hello-foundry-local/requirements.txt | Adds missing requirements file with SDK + OpenAI deps. |
| samples/python/hello-foundry-local/README.md | Adds cache-aware feature notes + clarifies run steps. |
| samples/python/functioncalling/fl_tools.ipynb | Adds explicit lifecycle (start/cache/download/load) before tool-calling demo. |
| samples/python/functioncalling/README.md | Fixes notebook link + adds prerequisites/features. |
| samples/python/agent-framework/tests/test_smoke.py | Adds smoke tests for imports, doc loading, env override, demo registry. |
| samples/python/agent-framework/src/app/web.py | Flask web UI + SSE endpoints for orchestrator + demos. |
| samples/python/agent-framework/src/app/tool_demo.py | Standalone tool-calling validation for direct + LLM-driven tools. |
| samples/python/agent-framework/src/app/orchestrator.py | Implements sequential/concurrent/hybrid orchestration as async generators. |
| samples/python/agent-framework/src/app/foundry_boot.py | Bootstrapper for Foundry Local endpoint/model selection + env override. |
| samples/python/agent-framework/src/app/documents.py | Loads/chunks local docs into retriever context. |
| samples/python/agent-framework/src/app/demos/weather_tools.py | Adds multi-tool weather demo. |
| samples/python/agent-framework/src/app/demos/sentiment_analyzer.py | Adds sentiment/emotion/key-phrase tools demo. |
| samples/python/agent-framework/src/app/demos/registry.py | Central demo registry for web UI listing/routing. |
| samples/python/agent-framework/src/app/demos/multi_agent_debate.py | Adds multi-agent debate demo. |
| samples/python/agent-framework/src/app/demos/math_agent.py | Adds math/tools demo (includes expression evaluation). |
| samples/python/agent-framework/src/app/demos/code_reviewer.py | Adds code review tools demo. |
| samples/python/agent-framework/src/app/demos/__init__.py | Exposes demos + registry helpers for import/registration. |
| samples/python/agent-framework/src/app/agents.py | Agent factories + shared tool functions. |
| samples/python/agent-framework/src/app/__main__.py | CLI entry (web/cli modes) + orchestrator runner. |
| samples/python/agent-framework/src/app/__init__.py | Defines package root. |
| samples/python/agent-framework/requirements.txt | Declares runtime dependencies for the new sample. |
| samples/python/agent-framework/pyproject.toml | Packaging metadata + deps + dev extras (pytest). |
| samples/python/agent-framework/data/orchestration_patterns.md | Sample docs for retriever context. |
| samples/python/agent-framework/data/foundry_local_overview.md | Sample docs for retriever context. |
| samples/python/agent-framework/data/agent_framework_guide.md | Sample docs for retriever context. |
| samples/python/agent-framework/README.md | Full sample documentation + quickstart + structure. |
| samples/python/agent-framework/.env.example | Environment template for model/docs/log level. |
| samples/js/web-server-example/app.js | Adds cache check + progress bar before downloading models. |
| samples/js/tool-calling-foundry-local/src/app.js | Adds cache check + progress bar before downloading models. |
| samples/js/native-chat-completions/app.js | Adds cache check + reusable progress bar for model download. |
| samples/js/local-rag/src/vectorStore.js | New SQLite-backed TF store with inverted index + caching. |
| samples/js/local-rag/src/server.js | New Express server with SSE status + chat + upload + ingestion. |
| samples/js/local-rag/src/prompts.js | System prompts for gas-field RAG agent (full + compact). |
| samples/js/local-rag/src/ingest.js | New ingestion script to chunk + index docs into SQLite. |
| samples/js/local-rag/src/config.js | Config for model, chunking, paths, and server settings. |
| samples/js/local-rag/src/chunker.js | Front-matter parsing + chunking + cosine similarity helpers. |
| samples/js/local-rag/src/chatEngine.js | Initializes SDK/model + retrieval + streaming/non-streaming responses. |
| samples/js/local-rag/package.json | New package manifest for local-rag sample. |
| samples/js/local-rag/docs/valve-inspection.md | Domain doc for RAG ingestion. |
| samples/js/local-rag/docs/pressure-testing.md | Domain doc for RAG ingestion. |
| samples/js/local-rag/docs/ppe-requirements.md | Domain doc for RAG ingestion. |
| samples/js/local-rag/docs/gas-leak-detection.md | Domain doc for RAG ingestion. |
| samples/js/local-rag/docs/emergency-shutdown.md | Domain doc for RAG ingestion. |
| samples/js/local-rag/README.md | New sample documentation (setup/ingest/architecture). |
| samples/js/local-cag/src/server.js | New Express server for CAG sample + init status SSE. |
| samples/js/local-cag/src/prompts.js | System prompts for gas-field CAG agent (full + compact). |
| samples/js/local-cag/src/modelSelector.js | Auto model selection based on RAM + caching preference. |
| samples/js/local-cag/src/context.js | Loads docs + keyword scoring + builds selected context per query. |
| samples/js/local-cag/src/config.js | Config for model selection, RAM budget, server, and context size. |
| samples/js/local-cag/src/chatEngine.js | Initializes SDK/model + injects preloaded context per query. |
| samples/js/local-cag/package.json | New package manifest for local-cag sample. |
| samples/js/local-cag/docs/valve-inspection.md | Domain doc for CAG startup context. |
| samples/js/local-cag/docs/pressure-testing.md | Domain doc for CAG startup context. |
| samples/js/local-cag/docs/ppe-requirements.md | Domain doc for CAG startup context. |
| samples/js/local-cag/docs/gas-leak-detection.md | Domain doc for CAG startup context. |
| samples/js/local-cag/docs/emergency-shutdown.md | Domain doc for CAG startup context. |
| samples/js/local-cag/README.md | New sample documentation (setup/architecture/config). |
| samples/js/langchain-integration-example/app.js | Adds cache check + progress bar before downloading models. |
| samples/js/electron-chat-application/package.json | Adds missing foundry-local-sdk dependency. |
| samples/js/copilot-sdk-foundry-local/src/tool-calling.ts | Pins SDK version + cache-aware model download. |
| samples/js/copilot-sdk-foundry-local/src/app.ts | Pins SDK version + cache-aware model download. |
| samples/js/copilot-sdk-foundry-local/package.json | Pins foundry-local-sdk version. |
| samples/js/chat-and-audio-foundry-local/package.json | Pins foundry-local-sdk version. |
| samples/js/audio-transcription-example/app.js | Adds cache check + progress bar before downloading models. |
| samples/cs/whisper-transcription/wwwroot/styles.css | New UI styling for Whisper transcription sample. |
| samples/cs/whisper-transcription/wwwroot/index.html | New drag/drop UI for uploading and transcribing audio. |
| samples/cs/whisper-transcription/wwwroot/app.js | Client-side upload/transcribe/copy + health polling. |
| samples/cs/whisper-transcription/nuget.config | Adds package source mapping for Foundry packages. |
| samples/cs/whisper-transcription/appsettings.json | Adds Foundry config (model alias, log level). |
| samples/cs/whisper-transcription/WhisperTranscription.csproj | New ASP.NET Core project for transcription service. |
| samples/cs/whisper-transcription/Services/TranscriptionService.cs | Implements streaming transcription via Foundry SDK audio client. |
| samples/cs/whisper-transcription/Services/FoundryOptions.cs | Options binding for model alias + logging. |
| samples/cs/whisper-transcription/Services/FoundryModelService.cs | Initializes Foundry manager + cache-aware download + load. |
| samples/cs/whisper-transcription/README.md | New sample documentation + endpoints + setup. |
| samples/cs/whisper-transcription/Program.cs | Minimal API endpoints + swagger + error middleware. |
| samples/cs/whisper-transcription/Middleware/ErrorHandlingMiddleware.cs | Centralized exception-to-JSON error handling. |
| samples/cs/whisper-transcription/Health/FoundryHealthCheck.cs | Health check that validates model availability. |
| samples/cs/GettingStarted/src/ToolCallingFoundryLocalWebServer/Program.cs | Adds explicit cache check + download progress bar. |
| samples/cs/GettingStarted/src/ToolCallingFoundryLocalSdk/Program.cs | Adds explicit cache check + download progress bar. |
| samples/cs/GettingStarted/src/ModelManagementExample/Program.cs | Adds explicit cache check + download progress bar. |
| samples/cs/GettingStarted/src/HelloFoundryLocalSdk/Program.cs | Adds explicit cache check + download progress bar. |
| samples/cs/GettingStarted/src/FoundryLocalWebServer/Program.cs | Adds explicit cache check + download progress bar. |
| samples/cs/GettingStarted/src/AudioTranscriptionExample/Program.cs | Adds explicit cache check + download progress bar. |
| SUPPORT.md | Replaces template placeholders with real support guidance. |
Copilot reviewed 93 out of 93 changed files in this pull request and generated 4 comments.
… claims
- FoundryModelService.cs: add SemaphoreSlim for thread-safe InitializeAsync to prevent concurrent callers from double-initializing in ASP.NET
- summarize/README.md: align docs with code (uses first cached model, not phi-4-mini default)
- local-rag/README.md: replace 'TF-IDF' with 'term-frequency' throughout since the implementation uses raw term-frequency maps without IDF weighting
Copilot reviewed 93 out of 93 changed files in this pull request and generated 9 comments.
…onToken, README accuracy
Copilot reviewed 93 out of 93 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
samples/python/agent-framework/src/app/web.py:177
`api_demo_run()` creates a new event loop but doesn't call `asyncio.set_event_loop(loop)` (and doesn't clear it). For consistency with `api_run()` and to avoid libraries failing due to a missing current event loop, set/clear the loop in a `try`/`finally` around `run_until_complete`.
Copilot reviewed 94 out of 94 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
samples/python/agent-framework/src/app/web.py:183
SSE responses for `/api/demo/<demo_id>/run` are returned with only `mimetype="text/event-stream"`. For consistent real-time streaming (especially behind proxies), add the usual SSE headers (`Cache-Control: no-cache`, `Connection: keep-alive`, and optionally `X-Accel-Buffering: no`) to this `Response` as well.
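As a rough illustration of the headers named above, the sketch below builds them as a plain dictionary and formats one event in the `text/event-stream` wire format. It is framework-free so it runs on its own; `sse_headers` and `format_sse` are hypothetical helper names, not functions from the sample.

```python
from typing import Dict, Optional


def sse_headers(disable_proxy_buffering: bool = True) -> Dict[str, str]:
    """Headers commonly sent alongside a text/event-stream response."""
    headers = {"Cache-Control": "no-cache", "Connection": "keep-alive"}
    if disable_proxy_buffering:
        # Asks nginx-style reverse proxies not to buffer the stream.
        headers["X-Accel-Buffering"] = "no"
    return headers


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialize one event: optional 'event:' line, one 'data:' line per text line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

In a Flask handler, the dictionary could plausibly be passed as the `headers` argument of `Response` next to `mimetype="text/event-stream"`; that wiring is an assumption about how the sample would adopt it.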
Copilot reviewed 94 out of 94 changed files in this pull request and generated 4 comments.
Copilot reviewed 94 out of 94 changed files in this pull request and generated 3 comments.
Copilot reviewed 94 out of 94 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
samples/python/agent-framework/src/app/web.py:173
`prompt = data.get(...).strip()` will raise an exception if the JSON field is not a string, leading to a 500 instead of a 400. Add a type check/coercion before `.strip()` and return a validation error for non-string input.
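One way the type check could look, as a sketch. The helper name `parse_prompt`, the default field name `"prompt"`, and the error messages are assumptions; the original key passed to `data.get(...)` is not shown in the comment.

```python
from typing import Any, Optional, Tuple


def parse_prompt(data: Any, field: str = "prompt") -> Tuple[Optional[str], Optional[str]]:
    """Return (prompt, error); exactly one of the two is None."""
    if not isinstance(data, dict):
        return None, "request body must be a JSON object"
    value = data.get(field)
    if not isinstance(value, str):
        # Covers a missing field (None) as well as numbers, lists, and objects.
        return None, f"'{field}' must be a string"
    prompt = value.strip()
    if not prompt:
        return None, f"'{field}' must not be empty"
    return prompt, None
```

A handler would then return the error with status 400 whenever the second element is not `None`, instead of letting `.strip()` raise and surface as a 500.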
Copilot reviewed 94 out of 94 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (7)
samples/js/local-rag/src/chatEngine.js:1
The async generator can hang indefinitely if `completeStreamingChat(...)` rejects, because `done` is only set in `.then(...)` and the waiting loop relies on `done` to terminate. Set `done = true` and resolve any pending waiter in a `.catch(...)` or `.finally(...)`, and consider capturing the error to rethrow after flushing buffered chunks so SSE clients don’t stall on failures.
samples/js/local-cag/src/chatEngine.js:1
Same streaming failure-mode as the Local RAG engine: if `completeStreamingChat(...)` throws/rejects, the generator can wait forever because `done` is only set in `.then(...)`. Add a `.catch(...)`/`.finally(...)` that sets `done = true` and releases the waiter, and propagate the error (e.g., by storing it and throwing after the loop) so callers can send an SSE error event.
samples/python/agent-framework/src/app/foundry_boot.py:1
In the external-endpoint override path, `model_id` is set to `self.alias`. If the remote OpenAI-compatible endpoint expects a model ID (not an alias), downstream clients (e.g., `OpenAIChatClient(model_id=conn.model_id)`) can fail unexpectedly. A more robust contract is to allow specifying `FOUNDRY_MODEL_ID` (or reuse `MODEL_ALIAS` vs `MODEL_ID` explicitly) and keep `model_alias`/`model_id` distinct.
samples/rag/README.md:1
The README snippet doesn’t handle the `GetModelAsync(...)` not-found case; if it returns null, the next line will throw. Update the documentation snippet to use the same defensiveness as the notebook code (null-coalescing throw or explicit check) so copy/paste users don’t hit a `NullReferenceException`.
samples/python/agent-framework/src/app/web.py:1
`traceback` is imported but never used in this module. Removing unused imports reduces lint noise and keeps the sample easier to maintain.
samples/python/agent-framework/src/app/tool_demo.py:1
`asyncio` is imported but not used anywhere in this file. Removing it avoids confusion about event loop usage in this demo.
samples/python/agent-framework/src/app/orchestrator.py:1
`logging` (and `re`) are currently unused in this module (`log` is defined but never referenced). Removing unused imports/variables will keep the sample clean and reduce warnings for users running linters.
Copilot reviewed 94 out of 94 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (8)
samples/js/local-rag/src/chatEngine.js:1
If `completeStreamingChat` rejects/throws, `done` is never set and `notify()` is never called, which can leave the async generator blocked indefinitely waiting for more data. Attach a rejection handler that (1) records the error, (2) sets `done = true`, (3) calls `notify()`, and then have the generator either yield an error event or rethrow after draining buffered chunks.
samples/js/local-cag/src/chatEngine.js:1
Same streaming failure-mode as the RAG engine: if `completeStreamingChat` rejects, `done` never flips to true and the generator can hang forever. Add a `.catch(...)` path that marks completion and propagates the error (e.g., by storing it and throwing it from the generator after waking it).
samples/js/local-rag/src/chatEngine.js:1
This assumes `catalog.getModel(...)` always returns a model object. If the alias is invalid or the SDK returns null/undefined, this will throw on `this.model.alias` with a less actionable error. Prefer an explicit null-check and throw a clear error that includes the requested alias (and optionally suggests listing available models).
samples/js/local-rag/src/server.js:1
`fs.writeFileSync` (and `mkdirSync`) blocks the Node.js event loop, so large uploads or slow disks can stall all concurrent requests. Prefer the async `fs.promises` equivalents (await `mkdir` + `writeFile`) or stream to disk to keep the server responsive under load.
samples/rag/README.md:1
`GetModelAsync(...)` can return null; the subsequent `model.IsCachedAsync()` would then throw a null-reference exception. Update the snippet to explicitly handle the null case (e.g., throw with a clear message) so the README code is copy/paste safe.
samples/python/summarize/summarize.py:1
The PR description states `summarize.py` was changed to load by alias (`load_model(cached_models[0].alias)`), but the updated code loads by cached model ID. Either update the PR description to match the implementation, or switch the code back to alias-based loading if that’s the intended/required SDK pattern.
samples/python/agent-framework/src/app/web.py:1
Using module-level mutable globals for `_conn`/`_docs` makes the app hard to reason about in multi-app/test scenarios and can lead to cross-test interference (e.g., calling `create_app` twice overwrites shared state). Store these on the Flask app instance (`app.config` / `g`) or close over them inside `create_app` so each app instance is isolated.
samples/python/agent-framework/src/app/demos/math_agent.py:1
Even with builtins disabled and a restricted character set, `eval` still permits potentially expensive computations (e.g., very large exponentiation via `**`, or huge integers) that can cause CPU/memory DoS when invoked via tool-calling. Consider explicitly rejecting `**` and extremely long inputs (length/digit-count limits), or replace `eval` with an AST-based expression evaluator that only supports the intended operators with bounded complexity.
Copilot reviewed 94 out of 94 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (11)
samples/python/summarize/summarize.py:1
`download_model(args.model)` may cache a specific resolved variant, but the code sets `model_name = model_info.id` (from the catalog lookup) rather than using the actual downloaded variant’s id. If resolution differs, `load_model(model_name)` can fail or load the wrong variant. Prefer capturing the return value of `download_model(...)` (or re-reading `list_cached_models()` after download) and loading by that returned cached variant id.
samples/python/hello-foundry-local/src/app.py:1 (listed twice)
After `download_model(alias)`, the code sets `model_id = model_info.id` instead of using the id of the downloaded cached variant (similar to the summarize sample). If the download resolves to a different variant than `model_info.id`, `load_model(model_id)` may not load what was downloaded. Prefer using the returned value from `download_model(alias)` (if available) to set `model_id`, or re-query cached models and pick the cached variant id.
samples/python/agent-framework/src/app/foundry_boot.py:1 (listed three times)
The class docstring claims the bootstrapper “download → load”, but the implementation only constructs `FoundryLocalManager(self.alias)` and reads `get_model_info`. If the SDK constructor doesn’t guarantee download+load semantics in all environments, this is misleading and can cause hard-to-debug runtime failures. Either (a) make the bootstrapper explicitly cache-check/download/load (mirroring the cache-aware flows used elsewhere in the PR), or (b) update the docstring/comments to accurately describe what’s guaranteed here.
samples/js/local-cag/src/server.js:1
Unlike the local-rag server’s status SSE, the local-cag status SSE connections are never closed when initialization completes. If users refresh or open multiple tabs, these long-lived connections can accumulate unnecessarily. Consider ending and removing connected SSE clients once `state.stage === "ready"` (or likewise on terminal error) and clearing the set to avoid resource/connection pressure.
samples/python/agent-framework/src/app/web.py:1
The `traceback` import is unused in this module. Removing it helps keep the sample minimal and avoids implying stack traces are intended to be surfaced.
samples/python/agent-framework/src/app/tool_demo.py:1 (listed three times)
`asyncio`, `Console`, and `console` aren’t used in this module. Removing unused imports/variables reduces noise and avoids suggesting output is routed through Rich when it isn’t.
Copilot reviewed 94 out of 94 changed files in this pull request and generated 3 comments.

    const engine = new ChatEngine();

    // ── API: Chat (non-streaming) ──
    app.post("/api/chat", async (req, res) => {
      try {
        const { message, history, compact } = req.body;
        if (!message || typeof message !== "string") {
          return res.status(400).json({ error: "message is required" });
        }

        if (compact !== undefined) engine.setCompactMode(!!compact);

        const result = await engine.query(
          message,
          Array.isArray(history) ? history : []
        );
        res.json(result);
      } catch (err) {
        console.error("[API] Error:", err.message);
        res.status(500).json({ error: "Internal server error" });
      }
    });

/api/chat, /api/chat/stream, /api/upload, and /api/docs can be called before engine.init() completes (the server starts listening first). In that window engine.getStore() is still null and engine.query*() will fail because the model/chatClient aren’t initialized, causing 500s or crashes. Add a requireReady middleware (similar to the local-cag sample) that returns 503 while engineReady is false, and apply it to all routes that depend on the initialized engine/store.

    // Load the model into memory
    this._emitStatus("loading", `Loading ${this.modelAlias} into memory...`);
    await this.model.load();

    // Create the native chat client with performance settings pre-configured
    this.chatClient = this.model.createChatClient();
    this.chatClient.settings.temperature = 0.1; // Low for deterministic, safety-critical responses
    this._emitStatus("ready", `Model ready: ${this.modelAlias}`);

    // Open the local vector store
    this.store = new VectorStore(config.dbPath);
    const count = this.store.count();
    this._emitStatus("ready", `Vector store ready: ${count} chunks indexed.`);

ChatEngine.init() emits status with phase: "ready" twice (once right after creating the chat client and again after opening the vector store). Because the server/UI treat phase === "ready" as “fully initialized” (and close the SSE stream / enable chat), the first emission can signal readiness before initialization is actually complete. Use a non-terminal phase for intermediate steps (e.g., model_ready, store_ready) and emit ready only once at the end of init().

    # Resolve alias to the actual model ID via the SDK's catalog API
    model_info = manager.get_model_info(self.alias)
    model_id = model_info.id if model_info else self.alias

FoundryLocalBootstrapper.bootstrap() silently falls back to using model_id=self.alias when get_model_info(self.alias) returns None. If the alias is wrong or missing from the catalog, this hides the configuration error and pushes the failure later into agent execution. Prefer failing fast here (e.g., raise a ValueError with a clear message) so misconfiguration is surfaced at startup.
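A sketch of the fail-fast variant described in this comment. Only `get_model_info(alias)` and the `.id` attribute come from the snippet above; `resolve_model_id`, `ModelInfo`, `FakeManager`, and the example alias and id strings are hypothetical stand-ins so the example runs without Foundry Local installed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ModelInfo:
    # Hypothetical stand-in for the SDK's model info object; only `id` is used here.
    id: str


class FakeManager:
    """Hypothetical stand-in for the SDK manager, backed by an in-memory catalog."""

    def __init__(self, catalog: Dict[str, str]):
        self._catalog = catalog

    def get_model_info(self, alias: str) -> Optional[ModelInfo]:
        model_id = self._catalog.get(alias)
        return ModelInfo(model_id) if model_id is not None else None


def resolve_model_id(manager, alias: str) -> str:
    """Resolve an alias to a model ID, raising instead of falling back to the alias."""
    model_info = manager.get_model_info(alias)
    if model_info is None:
        raise ValueError(
            f"Model alias '{alias}' was not found in the catalog; "
            "check the configured alias."
        )
    return model_info.id


manager = FakeManager({"example-alias": "example-model-id"})
resolved = resolve_model_id(manager, "example-alias")
```

Raising at bootstrap time moves the failure to startup, where the misconfigured alias is still visible in the error message, rather than deferring it to agent execution.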
Summary
This PR improves every existing sample across all languages (C#, JavaScript, Python, Rust) with cache-awareness and visual feedback, adds 4 brand-new samples, fixes SDK version inconsistencies across the repo, and addresses repo hygiene issues for public sharing readiness.
93 files changed — 67 new files, 26 modified files.
What's Changed
1. New Samples (4)
samples/js/local-cag/: Context-Augmented Generation (12 files)
Offline CAG-powered support agent for gas field engineers. Pre-loads domain documents (valve inspections, PPE requirements, emergency shutdown procedures, etc.) directly into the context window: no vector database, no embeddings, no retrieval pipeline needed.
samples/js/local-rag/: Retrieval-Augmented Generation (11 files)
Offline RAG-powered support agent using SQLite + term-frequency vectors for document retrieval. Demonstrates the full RAG pipeline running 100% locally.
… `npm run ingest`)
samples/python/agent-framework/: Microsoft Agent Framework Integration (24 files)
Full-featured agent framework sample showing Foundry Local as the LLM backend for agentic AI workflows.
samples/cs/whisper-transcription/: ASP.NET Core Whisper Transcription (13 files)
Production-quality audio transcription service using Foundry Local's Whisper model via WinML.
… `FoundryModelService`, `TranscriptionService`, `FoundryHealthCheck`
2. Cache-Awareness Improvements (All Existing Samples)
Every existing sample was updated to check the local model cache before attempting downloads. This provides:
… `✓ Model already cached` or `⏳ Downloading...`
C# samples updated (6 files):
- AudioTranscriptionExample/Program.cs
- FoundryLocalWebServer/Program.cs
- HelloFoundryLocalSdk/Program.cs
- ModelManagementExample/Program.cs
- ToolCallingFoundryLocalSdk/Program.cs
- ToolCallingFoundryLocalWebServer/Program.cs
JavaScript samples updated (7 files):
- audio-transcription-example/app.js
- copilot-sdk-foundry-local/src/app.ts and src/tool-calling.ts
- langchain-integration-example/app.js
- native-chat-completions/app.js
- tool-calling-foundry-local/src/app.js
- web-server-example/app.js
Python samples updated (4 files):
- hello-foundry-local/src/app.py
- summarize/summarize.py
- functioncalling/fl_tools.ipynb
- functioncalling/README.md
Notebooks updated (1 file):
- rag/rag_foundrylocal_demo.ipynb: significant rewrite with cache detection, clearer cell structure, and improved RAG pipeline
3. SDK API Correctness Fixes (7 files)
Validated all samples against the latest public SDK APIs (JS SDK `sdk/js/src`, Python SDK `sdk_legacy/python`, C# SDK `sdk/cs/src`) and fixed:
| File | Issue | Fix |
|---|---|---|
| js/local-cag/src/modelSelector.js | `selectedVariant._modelInfo` | `model.variants` / `variant.modelInfo` / `model.isCached` |
| js/local-rag/src/chatEngine.js | `progress * 100` yielded 0–10000 (SDK reports 0–100) | `Math.round(progress)` for display, `progress / 100` for normalized value |
| python/summarize/summarize.py | `load_model(cached_models[0].id)` inconsistent with alias pattern | `load_model(cached_models[0].alias)` |
| python/agent-framework/foundry_boot.py | `str(m)` substring match for model ID resolution | `manager.get_model_info(alias).id` |
| python/agent-framework/web.py | `drain()` buffered all SSE events before yielding | `__anext__()` loop for real-time streaming |
| cs/whisper-transcription/TranscriptionService.cs | `CancellationToken.None` hardcoded | `CancellationToken` through method and into all async calls |
| cs/whisper-transcription/FoundryModelService.cs | `progress % 10 == 0` unreliable for float | `Math.Floor(progress / 10)` threshold bucket approach |
4. Review Feedback Fixes: Round 2 (3 files)
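| File | Issue | Fix |
|---|---|---|
| cs/whisper-transcription/FoundryModelService.cs | `InitializeAsync()` not thread-safe; concurrent ASP.NET requests could double-initialize | `SemaphoreSlim` with double-check locking pattern |
| python/summarize/README.md | … `phi-4-mini` but code uses first cached model | Docs aligned with code (uses first cached model, not `phi-4-mini` default) |
| js/local-rag/README.md | 'TF-IDF' wording, although the implementation uses raw term-frequency maps without IDF weighting | 'TF-IDF' replaced with 'term-frequency' throughout |
5. Review Feedback Fixes: Round 3 (7 files)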
| File | Issue | Fix |
|---|---|---|
| python/agent-framework/README.md | `FLASK_PORT` env var that doesn't exist in code | `--port <number>` CLI flag which matches `__main__.py` |
| js/local-rag/package.json | `"tfidf"` keyword misleading; implementation is term-frequency only | `"term-frequency"` |
| python/agent-framework/web.py | `asyncio.new_event_loop()` without `set_event_loop()`; breaks on Python 3.10+ | `asyncio.set_event_loop(loop)` after creation, clears in `finally` block |
| cs/whisper-transcription/FoundryModelService.cs | `EnsureModelReadyAsync` lacked `CancellationToken` | `CancellationToken ct = default` parameter, threaded through `IsCachedAsync(ct)`, `DownloadAsync(..., ct)`, `LoadAsync(ct)` |
| cs/whisper-transcription/TranscriptionService.cs | … `ct` to `EnsureModelReadyAsync` | … `ct` from `TranscribeAsync` |
| js/local-cag/src/config.js | `host` hardcoded to `"127.0.0.1"` despite README documenting `HOST` env var | `process.env.HOST \|\| "127.0.0.1"` |
| js/local-rag/src/config.js | `FOUNDRY_MODEL`, `PORT`, `HOST` env vars documented but not read | `process.env.FOUNDRY_MODEL`, `parseInt(process.env.PORT, 10)`, `process.env.HOST` with sensible defaults |
6. SDK Version Fixes
| File | Before | After | Notes |
|---|---|---|---|
| samples/js/local-cag/package.json | `^0.9.0` | `^0.5.1` | |
| samples/js/local-rag/package.json | `^0.9.0` | `^0.5.1` | |
| samples/js/copilot-sdk-foundry-local/package.json | `"latest"` | `^0.5.1` | |
| samples/js/chat-and-audio-foundry-local/package.json | `"latest"` | `^0.5.1` | |
| samples/js/electron-chat-application/package.json | | `^0.5.1` | `foundry-local-sdk` not listed despite `import` in `main.js` |
| samples/python/summarize/requirements.txt | `>=0.3.1` | `>=0.5.1` | |
| samples/python/hello-foundry-local/requirements.txt | | `>=0.5.1` | |
7. Repo Hygiene
… `TODO` and `REPO MAINTAINER: INSERT INSTRUCTIONS HERE` placeholders) with actual content pointing to GitHub Issues, docs, and samples.
Validation Performed
- … `sdk/js/src`, `sdk_legacy/python`, `sdk/cs/src`
- … `api_key` references are programmatic
SDK Version Matrix (Current State)
- `Microsoft.AI.Foundry.Local`: `Directory.Packages.props`
- `Microsoft.AI.Foundry.Local.WinML`: `Directory.Packages.props`
- `foundry-local-sdk`: `package.json` files
- `foundry-local-sdk`: `requirements.txt` files
- `foundry-local-sdk`: `sdk/rust/`
Notes
- … `native-chat-completions`, `web-server-example`, `audio-transcription-example`, `langchain-integration-example`) are intentionally single-file with no `package.json`; their READMEs instruct users to `npm install` manually.
- … `path = "../../../sdk/rust"`, which always resolves to the latest local SDK.
- `functioncalling` notebook uses `! pip install foundry-local-sdk` without a version pin, which is standard for notebooks.