fix: downmix multi-channel audio to mono for SFSpeechRecognizer #32

Open
chrishutchins wants to merge 1 commit into f:master from chrishutchins:fix/multichannel-speech-recognition

Conversation

@chrishutchins

Summary

  • Problem: `SFSpeechRecognizer` silently returns no results when it receives multi-channel audio buffers. USB audio interfaces like the RODECaster Pro II send 2-channel 48 kHz audio, and the previous `format: nil` tap delivered these buffers unchanged to the recognition request. The waveform visualization works fine (it only reads channel 0), but the recognizer never fires its result callback.

  • Fix: When the input device has more than one channel, create a mono `AVAudioFormat` at the hardware sample rate and pass it to `installTap`. `AVAudioEngine` handles the downmix automatically. Single-channel devices (e.g. built-in mic) are unaffected.
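
The fix described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual diff: the helper names `tapFormat(for:)` and `installRecognitionTap(on:request:)`, the bus index 0, and the buffer size of 1024 are all hypothetical.

```swift
import AVFoundation
import Speech

/// Hypothetical helper (name not taken from the PR): picks the format to hand
/// to `installTap`. Returns nil for single-channel input so the tap keeps the
/// node's own format, matching the previous `format: nil` behavior.
func tapFormat(for hardwareFormat: AVAudioFormat) -> AVAudioFormat? {
    guard hardwareFormat.channelCount > 1 else { return nil }
    // Multi-channel input: request mono at the hardware sample rate.
    return AVAudioFormat(standardFormatWithSampleRate: hardwareFormat.sampleRate,
                         channels: 1)
}

/// Hypothetical wiring: installs a tap on the engine's input node and forwards
/// each buffer to the recognition request. Bus 0 and bufferSize 1024 are
/// assumptions made for this sketch.
func installRecognitionTap(on engine: AVAudioEngine,
                           request: SFSpeechAudioBufferRecognitionRequest) {
    let input = engine.inputNode
    let hardwareFormat = input.inputFormat(forBus: 0)
    input.removeTap(onBus: 0)
    input.installTap(onBus: 0,
                     bufferSize: 1024,
                     format: tapFormat(for: hardwareFormat)) { buffer, _ in
        request.append(buffer)
    }
}
```

Keeping the format choice in a separate function means the channel-count decision can be checked without a live audio device.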

Test plan

  • Test with a multi-channel USB audio interface (e.g. RODECaster Pro II or Focusrite Scarlett): speech recognition should now produce results
  • Test with the built-in microphone: no change in behavior
  • Test switching between the built-in mic and a USB interface mid-session: recognition should restart and work with either device

🤖 Generated with Claude Code

SFSpeechRecognizer silently returns no results when receiving
multi-channel audio buffers. USB audio interfaces like the RODECaster
Pro II send 2-channel 48kHz audio, and the previous `format: nil` tap
delivered these buffers unchanged to the recognition request.

Create a mono AVAudioFormat at the hardware sample rate when the device
has more than one channel and pass it to installTap, letting
AVAudioEngine handle the downmix automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>