feat(voice): batch STT via Yapper (Phase 1) by dimakis · Pull Request #105 · dimakis/mitzo

dimakis · 2026-04-05T19:10:43Z

Summary

Audio capture (audio.ts): MediaRecorder wrapper with runtime format negotiation (WebM/Opus preferred, MP4 fallback for Safari), auto-stop timer, cancel support, and FormData helper for Yapper's /v1/transcribe endpoint
Voice hook (useVoice.ts): Yapper health polling (30s), browser mic permission flow with micBlocked state, recording state machine, and batch transcription via POST
MicButton component: push-to-talk via pointer events (hold to record, release to send, drag away to cancel), with recording pulse, transcribing spinner, and blocked states
ChatInput/ChatView wiring: optional voice prop on ChatInput, transcript inserted into textarea for user review before sending. Voice is purely opt-in — Mitzo works identically without Yapper

Frontend-only — server and v2 protocol are untouched. Follows the voice integration design doc.
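As an illustration of the FormData helper mentioned in the summary, the upload needs a file name whose extension matches the negotiated container. The sketch below is a guess at how that could look, not the code in audio.ts: `recordingFileName` is a hypothetical helper, and the multipart field name `file` in the closing comment is an assumption about Yapper's /v1/transcribe endpoint.

```typescript
// Hypothetical helper: derive an upload file name from the recorder's MIME type.
// Codec parameters such as ";codecs=opus" are ignored.
function recordingFileName(mimeType: string): string {
  const [base = ""] = mimeType.split(";");
  switch (base.trim().toLowerCase()) {
    case "audio/mp4":
      return "recording.mp4"; // Safari fallback container
    case "audio/ogg":
      return "recording.ogg";
    default:
      return "recording.webm"; // preferred WebM/Opus path, also used for unknown types
  }
}

// In a browser the result would typically be passed to FormData.append, e.g.
//   form.append("file", blob, recordingFileName(blob.type));
// where the field name "file" is an assumption, not confirmed by this PR.
```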

Test plan

34 new tests across 4 test files (audio, useVoice, MicButton, ChatInput integration)
Full suite passes: 545/545
Manual: verify mic button appears when Yapper is running on LAN
Manual: hold-to-record → release → transcript appears in textarea
Manual: mic button hidden when Yapper is offline
Manual: Safari format fallback (MP4)

🤖 Generated with Claude Code

MediaRecorder wrapper with runtime mimeType negotiation (WebM/Opus preferred, MP4 fallback for Safari), auto-stop timer, cancel support, and FormData helper for Yapper transcription endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
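The runtime mimeType negotiation described in this commit can be sketched as a preference-ordered probe. This is a minimal sketch and not the code in audio.ts: the candidate list and the `pickMimeType` name are assumptions, and the support check is injected so the logic can run outside a browser.

```typescript
// Assumed preference order: WebM/Opus first, MP4 as the Safari fallback.
const CANDIDATE_MIME_TYPES = ["audio/webm;codecs=opus", "audio/webm", "audio/mp4"] as const;

type SupportCheck = (mimeType: string) => boolean;

// Returns the first supported candidate, or undefined so the caller can
// construct MediaRecorder without an explicit mimeType.
function pickMimeType(isSupported: SupportCheck): string | undefined {
  return CANDIDATE_MIME_TYPES.find((candidate) => isSupported(candidate));
}

// In a browser the check would be the static MediaRecorder.isTypeSupported:
//   const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
```

Injecting the check keeps the negotiation testable without a real MediaRecorder.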

Yapper health polling (30s interval), browser mic permission flow, MediaRecorder capture, and POST /v1/transcribe for batch transcription. Graceful degradation when Yapper is unavailable or mic is blocked.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
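The recording flow in the hook can be pictured as a small state machine. The sketch below is one plausible shape and is not taken from useVoice.ts: the state and event names are assumptions, and `blocked` is treated as terminal here for simplicity.

```typescript
type VoiceState = "idle" | "recording" | "transcribing" | "blocked";

type VoiceEvent =
  | { type: "start" }            // user begins holding the mic button
  | { type: "stop" }             // user releases: upload for transcription
  | { type: "cancel" }           // user drags away: discard the recording
  | { type: "transcribed" }      // transcription request succeeded
  | { type: "failed" }           // transcription request failed
  | { type: "permissionDenied" }; // browser refused microphone access

// Pure transition function; events that are invalid in a state are ignored.
function nextState(state: VoiceState, event: VoiceEvent): VoiceState {
  if (state === "blocked") return state;
  switch (event.type) {
    case "permissionDenied":
      return "blocked";
    case "start":
      return state === "idle" ? "recording" : state;
    case "stop":
      return state === "recording" ? "transcribing" : state;
    case "cancel":
      return state === "recording" ? "idle" : state;
    case "transcribed":
    case "failed":
      return state === "transcribing" ? "idle" : state;
  }
}
```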

Hold-to-record button with pointer events. States: hidden (unavailable), idle, recording (red pulse), transcribing (spinner), mic-blocked. Cancel on pointer leave.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
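The hold, release, and drag-away gestures map onto three pointer handlers. The following is a framework-free sketch under assumed names (`createPushToTalk`, `start`, `send`, `cancel`); the actual MicButton component may be structured differently.

```typescript
interface PushToTalkCallbacks {
  start: () => void;  // begin recording
  send: () => void;   // stop and submit for transcription
  cancel: () => void; // stop and discard
}

// Returns handlers intended for pointerdown, pointerup, and pointerleave.
// The `held` flag ensures release and leave only act while a press is active.
function createPushToTalk(callbacks: PushToTalkCallbacks) {
  let held = false;
  return {
    onPointerDown(): void {
      if (held) return;
      held = true;
      callbacks.start();
    },
    onPointerUp(): void {
      if (!held) return;
      held = false;
      callbacks.send();
    },
    onPointerLeave(): void {
      if (!held) return;
      held = false;
      callbacks.cancel();
    },
  };
}
```

Tracking the press explicitly means a stray pointerleave after release does not cancel a recording that was already sent.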

ChatInput accepts optional voice prop, renders MicButton in the input row. Transcript is inserted into the textarea on recording stop. ChatView owns useVoice hook and passes it down. Mic button CSS with recording pulse, transcribing spin, and blocked states.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
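Inserting the transcript into the textarea, rather than sending it directly, lets the user review it first. A minimal sketch of the text merge follows, assuming the transcript is appended to the existing draft; the PR does not state whether it appends or inserts at the cursor, and `appendTranscript` is a hypothetical name.

```typescript
// Hypothetical helper: merge a transcript into the current draft text.
// Blank transcripts leave the draft untouched; a single space separates
// the transcript from existing text unless the draft already ends in whitespace.
function appendTranscript(draft: string, transcript: string): string {
  const text = transcript.trim();
  if (text === "") return draft;
  if (draft === "" || /\s$/.test(draft)) return draft + text;
  return `${draft} ${text}`;
}
```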

Verifies mic button visibility, recording state, transcript insertion, mic-blocked display, and graceful absence when voice prop is omitted.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(design): tts playback design doc (phase 2)

  Covers: useVoice TTS extension, text chunking with pipelining, AudioContext playback, VoiceSettings component, ChatView auto-speak on message_end, interruption rules, voice selection, and error handling.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(design): address review feedback on tts design

  - AudioContext reuse: singleton with lazy creation and cleanup
  - AbortController on synthesize() for cancellable fetches
  - Track messageId instead of messages.length for TTS trigger
  - Simplify to sequential playback for MVP (no pipelining)
  - Lazy voice list fetch (on first TTS enable, not mount)
  - Dynamic default voice from /v1/voices response

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
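The text chunking mentioned in the TTS design doc is not spelled out in this PR. One plausible approach, sketched here under assumptions, is to split on sentence boundaries and pack sentences into chunks up to a length budget; `chunkText` and the 200-character default are hypothetical, and the chunks would then be synthesized and played one after another per the sequential-playback decision.

```typescript
// Hypothetical sentence-packing chunker for TTS input.
// A single sentence longer than maxLen is kept whole rather than split mid-sentence.
function chunkText(text: string, maxLen = 200): string[] {
  // Each match is a run of non-terminator characters plus any trailing
  // terminators (. ! ?) and whitespace.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    // Flush the current chunk if adding this sentence would exceed the budget.
    if (current !== "" && (current + sentence).trim().length > maxLen) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim() !== "") chunks.push(current.trim());
  return chunks;
}
```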

dimakis and others added 5 commits April 5, 2026 20:01

feat(voice): add MicButton push-to-talk component

9ec6223

test(voice): add ChatInput voice integration tests

807a2b2

dimakis mentioned this pull request Apr 5, 2026

docs(design): tts playback design doc (phase 2) #106

Merged

dimakis merged commit fd283be into main Apr 5, 2026
1 check passed

dimakis merged 6 commits into main from feat/voice-stt-batch
