A runnable sample that wires the browser microphone to Agent voice mode in Microsoft Foundry. A small FastAPI broker holds the credentials, opens a WebSocket to the Voice Live realtime endpoint, and binds the session to a hosted Foundry agent. The agent answers travel questions using tool calls and replies in natural speech.
A three-part workshop in labs/ walks you from a basic voice loop to a fully bound hosted agent.
- A sample. Clone it, fill in .env, run scripts/start-local.ps1, and talk to the agent in your browser.
- A workshop. Three progressive labs under labs/ teach the pattern step by step.
- A reference. The exact Voice Live URL contract that the Foundry portal uses is encoded in voicelive/server/voicelive_session.py and probed by scripts/test-session.ps1. Use it as a regression test when the API changes.
It is not a production library. The broker is intentionally small so the pattern is easy to fork.
- voicelive/server is a FastAPI broker that holds credentials, builds the upstream WebSocket URL, and relays audio frames in both directions.
- voicelive/client is a small static page that captures microphone audio, ships PCM16 frames over WebSocket, and renders transcripts with markdown.
- voicelive/config/session.json is the first frame the browser sends after the socket opens. It pins the voice, the noise reduction, the echo cancellation, and the semantic VAD.
- agent/ contains the Foundry agent definition, the system prompt, and three sample tools (weather, flight status, hotel details).
- infra/ contains a Bicep template that provisions the Foundry resource, the project, and the model deployment.
- labs/ is the three-part workshop.
- docs/blog/ contains a stand-alone HTML blog post that summarises the architecture for an external audience.
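The "PCM16 frames" mentioned above are 16-bit signed little-endian samples. In this sample the conversion happens in the browser client; the Python sketch below only illustrates the same arithmetic and is not taken from the repository. The function name `float_to_pcm16` is hypothetical, and scaling by 32767 is one common convention, assumed here.

```python
import struct
from typing import Iterable


def float_to_pcm16(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    out = bytearray()
    for sample in samples:
        # Clamp first so out-of-range input cannot overflow the 16-bit range.
        clamped = max(-1.0, min(1.0, sample))
        # "<h" packs one signed 16-bit integer in little-endian byte order.
        out += struct.pack("<h", int(clamped * 32767))
    return bytes(out)
```

Each sample becomes two bytes, so a frame of N samples is 2N bytes on the wire.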
- Copy .env.sample to .env and fill in the values from your Foundry resource.
- Create a Python virtual environment and install requirements.txt.
- Run scripts/start-local.ps1 to launch the broker on http://127.0.0.1:8000.
- Open the URL in a browser, allow microphone access, and start talking.
- Run scripts/test-session.ps1 for a non-interactive smoke test.
If you would rather build up to the full sample one step at a time, work through the labs in order.
- labs/lab1-basic-voice.md runs the broker against a plain model with no agent binding.
- labs/lab2-add-tools.md adds three Python tools and shows how Voice Live invokes them.
- labs/lab3-hosted-agent.md creates the hosted agent in the Foundry portal and wires it into the broker.
Each lab is self-contained and ends with a working checkpoint, so you can stop after any lab and still have something that runs.
The browser never sees an Azure key or token. The broker performs the upstream handshake with either a bearer token from DefaultAzureCredential or an API key, then pipes frames in both directions.
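The "pipes frames in both directions" step can be sketched as two concurrent pumps, one per direction, that stop together when either side closes. This is a minimal illustration rather than the broker's actual code: `pump` and `relay` are hypothetical names, and the two connections are reduced to an async iterator of frames plus an async send callable so the sketch does not depend on any particular WebSocket library.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Union

Frame = Union[bytes, str]
Send = Callable[[Frame], Awaitable[None]]


async def pump(source: AsyncIterator[Frame], send: Send) -> None:
    """Forward every frame from one side to the other until the source ends."""
    async for frame in source:
        await send(frame)


async def relay(
    browser_frames: AsyncIterator[Frame],
    send_to_upstream: Send,
    upstream_frames: AsyncIterator[Frame],
    send_to_browser: Send,
) -> None:
    """Run both directions concurrently and stop when either side finishes."""
    tasks = [
        asyncio.create_task(pump(browser_frames, send_to_upstream)),
        asyncio.create_task(pump(upstream_frames, send_to_browser)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # One side closed: cancel the other direction so no task is left dangling.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # Re-raise any error from the direction that finished first.
    for task in done:
        task.result()
```

Stopping on the first completed direction means a closed browser tab also tears down the upstream session instead of leaving it open.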
The Voice Live WebSocket URL is built as follows.
wss://<resource>.cognitiveservices.azure.com/voice-live/realtime
?api-version=2025-10-01
&model=<agent display name>
&agent-project-name=<project name>
&agent-id=<agent id>
&agent-access-token=<aad token for cognitive services>
&authorization=Bearer%20<same aad token>
The model query parameter must match the agent display name. The authorization value must be URL-encoded.
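As a rough illustration of the URL contract above, the sketch below assembles the same query string with the standard library. The function name `build_voice_live_url` and its parameter names are hypothetical; the real logic lives in voicelive/server/voicelive_session.py and may differ. Passing `quote_via=quote` makes the space in the authorization value encode as `%20` rather than `+`, matching the shape shown above.

```python
from urllib.parse import quote, urlencode

API_VERSION = "2025-10-01"


def build_voice_live_url(
    resource: str,
    agent_name: str,
    project_name: str,
    agent_id: str,
    aad_token: str,
) -> str:
    """Assemble the Voice Live WebSocket URL for a hosted agent session."""
    params = {
        "api-version": API_VERSION,
        # The model parameter carries the agent display name, not a deployment name.
        "model": agent_name,
        "agent-project-name": project_name,
        "agent-id": agent_id,
        "agent-access-token": aad_token,
        "authorization": f"Bearer {aad_token}",
    }
    # quote (not the default quote_plus) percent-encodes spaces as %20.
    query = urlencode(params, quote_via=quote)
    return f"wss://{resource}.cognitiveservices.azure.com/voice-live/realtime?{query}"
```

Because the token appears in the query string, the resulting URL should be treated as a secret and kept out of logs.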
- docs/architecture.md explains the request flow and the auth model.
- docs/demo-flow.md is a script you can read aloud during a demo.
- docs/deployment.md covers provisioning, the Foundry portal agent flow, and Container Apps hosting.
- docs/troubleshooting.md lists the common failures and how to fix them.
- docs/blog/voice-live-hosted-agent.html is the long-form write-up of the pattern.
The end-to-end deployment is documented in docs/deployment.md. The summary is as follows.
- Provision the Foundry resource, project, and model with the Bicep template in infra/.
- Create the agent in the Foundry portal. The SDK path in scripts/deploy-agent.ps1 produces an Assistants-style agent that lacks the microsoft.voice-live.enabled metadata required by the working URL shape, so the portal is currently the only reliable route.
- Assign the Cognitive Services User role to the identity that will run the broker.
- Set the broker environment variables and run it locally, or deploy it to Azure Container Apps with a managed identity.
See CONTRIBUTING.md for the contributor guide and CODE_OF_CONDUCT.md for the community standards. Security issues should be reported privately as described in SECURITY.md. Questions and help requests are covered in SUPPORT.md.
This project is released under the MIT License. See LICENSE for the full text.
This project may contain trademarks or logos for projects, products, or services. Authorised use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark and Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.