
@lukeocodes

This pull request updates and improves several real-time audio transcription and streaming examples, primarily by enhancing WebSocket usage, updating code to use newer client APIs, and reorganizing example files for clarity. The changes include replacing older synchronous examples with more complete and production-ready versions, introducing a new advanced Listen V2 example, and updating method calls to match the latest client library conventions.

Transcription Examples Modernization

  • Added a new example examples/13-transcription-live-websocket.py that streams audio chunks from a file in real time, simulating microphone input, and uses updated event handling and media transmission methods. The example also demonstrates both synchronous and asynchronous usage patterns.
  • Removed the older, less complete WebSocket transcription example examples/07-transcription-live-websocket.py.
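For the asynchronous pattern mentioned above, a minimal sketch of reading a file in fixed-size chunks and awaiting each send might look like the following. The `stream_file` helper, its parameters, and the `fake_send` callable are hypothetical names for illustration; `send_media` here is an injected coroutine standing in for the connection method, whose real signature is not shown in this PR.

```python
import asyncio
import io
from typing import Awaitable, BinaryIO, Callable


async def stream_file(
    audio_file: BinaryIO,
    send_media: Callable[[bytes], Awaitable[None]],
    chunk_size: int = 4096,
    delay: float = 0.0,
) -> int:
    """Read audio_file in chunk_size pieces and await send_media for each.

    send_media is a stand-in for the real connection method (assumption).
    Returns the number of chunks sent.
    """
    sent = 0
    while True:
        chunk = audio_file.read(chunk_size)
        if not chunk:
            break  # end of file
        await send_media(chunk)
        sent += 1
        # Yield to the event loop; a non-zero delay paces the stream.
        await asyncio.sleep(delay)
    return sent


if __name__ == "__main__":
    async def fake_send(chunk: bytes) -> None:
        # Hypothetical receiver that only reports the chunk size.
        print(f"sent {len(chunk)} bytes")

    asyncio.run(stream_file(io.BytesIO(b"\x00" * 10000), fake_send))
```

Passing the sender in as a callable keeps the sketch independent of the SDK, so it can be exercised without a network connection.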

Advanced Listen V2 Example

  • Added examples/14-transcription-live-websocket-v2.py, demonstrating advanced conversational speech recognition with Listen V2, including contextual turn detection and strict requirements for audio format and streaming.
  • Removed the previous Listen V2 example examples/26-transcription-live-websocket-v2.py, which lacked audio streaming and contextual turn handling.

API Method Updates Across Streaming Examples

  • Updated method calls in the text-to-speech streaming example (examples/21-text-to-speech-streaming.py, previously examples/11-text-to-speech-streaming.py) to use the latest client API conventions (send_text, send_flush, send_close) for both synchronous and asynchronous usage. [1] [2]
  • Updated method calls in the voice agent example (examples/30-voice-agent.py, previously examples/09-voice-agent.py) to use new API methods (send_settings, send_media) and updated async usage accordingly. [1] [2]

…-ready patterns

Reorganize all example files with a more scalable numbering system organized by feature area:
- 01-09: Authentication
- 10-19: Transcription (Listen)
- 20-29: Text-to-Speech (Speak)
- 30-39: Voice Agent
- 40-49: Text Intelligence (Read)
- 50-59: Management API
- 60-69: On-Premises
- 70-79: Configuration & Advanced
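
As an illustration of the scheme above, the numeric prefix of an example file can be mapped back to its feature area. This is only a sketch: `FEATURE_RANGES` and `feature_area` are hypothetical names and are not part of the repository.

```python
import os

# Feature-area ranges copied from the list above.
FEATURE_RANGES = [
    (1, 9, "Authentication"),
    (10, 19, "Transcription (Listen)"),
    (20, 29, "Text-to-Speech (Speak)"),
    (30, 39, "Voice Agent"),
    (40, 49, "Text Intelligence (Read)"),
    (50, 59, "Management API"),
    (60, 69, "On-Premises"),
    (70, 79, "Configuration & Advanced"),
]


def feature_area(path: str) -> str:
    """Return the feature area for a file such as 'examples/30-voice-agent.py'."""
    prefix = os.path.basename(path).split("-", 1)[0]
    if not prefix.isdigit():
        raise ValueError(f"no numeric prefix in {path!r}")
    number = int(prefix)
    for low, high, area in FEATURE_RANGES:
        if low <= number <= high:
            return area
    raise ValueError(f"prefix {number} is outside the known ranges")


if __name__ == "__main__":
    print(feature_area("examples/13-transcription-live-websocket.py"))
```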

Changes:
- Renamed all examples to follow new numbering scheme
- Updated WebSocket examples (13, 14) with production-ready streaming patterns
  - Removed artificial delays that don't reflect real usage
  - Simplified to straightforward file streaming approach
  - Added clear async implementation examples in comments
- Updated README.md to reflect new organization

The new numbering makes it easier to add future examples without renumbering existing ones.

Remove trailing whitespace and format code consistently in WebSocket streaming examples.

Update all WebSocket examples to use the correct method names:

Listen V1/V2:
- send_media() instead of send_listen_v_1_media() or send_listen_v_2_media()

Speak V1:
- send_text() instead of send_speak_v_1_text()
- send_flush() instead of send_speak_v_1_flush()
- send_close() instead of send_speak_v_1_close()

Agent V1:
- send_settings() instead of send_agent_v_1_settings()
- send_media() instead of send_agent_v_1_media()

Updated in examples:
- 13-transcription-live-websocket.py
- 14-transcription-live-websocket-v2.py
- 21-text-to-speech-streaming.py
- 30-voice-agent.py
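
The renames listed above can be captured in a lookup table. The sketch below applies them to source text as a rough migration aid; `RENAMES` and `migrate_method_names` are hypothetical helpers rather than anything shipped with the SDK, and the substitution is purely textual, so it assumes the old names do not appear in unrelated contexts.

```python
import re

# Old method name -> new method name, as listed in the commit message above.
RENAMES = {
    "send_listen_v_1_media": "send_media",
    "send_listen_v_2_media": "send_media",
    "send_speak_v_1_text": "send_text",
    "send_speak_v_1_flush": "send_flush",
    "send_speak_v_1_close": "send_close",
    "send_agent_v_1_settings": "send_settings",
    "send_agent_v_1_media": "send_media",
}

# Match whole identifiers only, so longer names containing an old name are untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_method_names(source: str) -> str:
    """Replace each old method name in source with its new equivalent."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


if __name__ == "__main__":
    print(migrate_method_names("connection.send_speak_v_1_flush()"))
```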
…xamples

- Add chunk delay calculation to simulate microphone audio streaming
- Refactor audio sending into background thread functions
- Align v2 example chunking behavior with v1 example
- Improve async examples with proper streaming delays
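
A rough sketch of the chunk-delay idea described above, assuming raw PCM audio: the delay for one chunk is its size divided by the byte rate, and a background thread sends chunks at that pace. The function names, the 16-bit mono defaults, and the injected `send_media` and `sleep` callables are assumptions for illustration and may differ from what the examples in this PR do.

```python
import threading
import time
from typing import Callable, Iterator


def chunk_delay_seconds(
    chunk_size: int,
    sample_rate: int,
    channels: int = 1,
    bytes_per_sample: int = 2,
) -> float:
    """Seconds of audio held in one chunk of raw PCM data."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return chunk_size / bytes_per_second


def iter_chunks(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive chunk_size slices of audio."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def start_streaming(
    audio: bytes,
    send_media: Callable[[bytes], None],
    chunk_size: int,
    sample_rate: int,
    sleep: Callable[[float], None] = time.sleep,
) -> threading.Thread:
    """Send audio on a background thread at roughly real-time pace.

    send_media stands in for the connection method (assumption).
    """
    delay = chunk_delay_seconds(chunk_size, sample_rate)

    def worker() -> None:
        for chunk in iter_chunks(audio, chunk_size):
            send_media(chunk)
            sleep(delay)  # wait for as long as the chunk would take to play

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    one_second = b"\x00" * 32000  # 1 s of 16 kHz, 16-bit mono silence
    sender = start_streaming(one_second, lambda c: print(len(c)), 3200, 16000)
    sender.join()
```

Running the sender on its own thread leaves the main thread free to receive events while the audio is still being streamed.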
@lukeocodes requested a review from Copilot on January 15, 2026 at 21:52

Copilot AI left a comment


Pull request overview

This PR refactors the example files to align with the new V6 SDK architecture: it updates method names, renumbers the files into ranges of ten per feature area, and replaces incomplete examples with production-ready versions that demonstrate real-time audio streaming.

Changes:

  • Updated WebSocket method calls across examples to use new V6 API conventions (send_text, send_media, send_settings, etc.)
  • Reorganized example file numbering to group features in ranges (01-09 for Auth, 10-19 for Transcription, etc.)
  • Replaced basic WebSocket examples with comprehensive versions that include actual audio streaming implementation

Reviewed changes

Copilot reviewed 7 out of 23 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| examples/README.md | Updated file numbering scheme and reorganized examples by feature area |
| examples/30-voice-agent.py | Updated method calls to use new V6 API (send_settings, send_media) |
| examples/26-transcription-live-websocket-v2.py | Removed incomplete Listen V2 example lacking audio streaming |
| examples/21-text-to-speech-streaming.py | Updated TTS WebSocket methods to V6 API (send_text, send_flush, send_close) |
| examples/14-transcription-live-websocket-v2.py | Added complete Listen V2 example with real-time audio streaming and turn detection |
| examples/13-transcription-live-websocket.py | Added complete Listen V1 example with audio streaming and finalization |
| examples/07-transcription-live-websocket.py | Removed incomplete Listen V1 example |
