docs(design): voice integration with Yapper #104
Merged
dimakis
commented
Apr 5, 2026
Owner
Author
Design review — looks solid. A few gaps to address before or during implementation:
Browser permissions
- `navigator.mediaDevices.getUserMedia` requires explicit user permission. Need a permission flow: prompt on first mic tap, handle denial gracefully (show "mic blocked" state, not just hidden button).
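One way the permission flow could look, as a rough sketch rather than the design's actual code. The `MicState` names, `micStateFromError`, and `requestMic` are hypothetical; `getUserMedia` is injected so the mapping can be tested outside a browser. `NotAllowedError` and `NotFoundError` are the standard rejection names for denied permission and a missing input device.

```typescript
// Hypothetical UI states for the mic button; names are illustrative only.
type MicState = "ready" | "blocked" | "no-device" | "error";

// Maps the `name` of a getUserMedia rejection to a UI state.
function micStateFromError(err: unknown): MicState {
  const name = (err as { name?: string } | null)?.name;
  switch (name) {
    case "NotAllowedError": // user or browser policy denied access
      return "blocked";
    case "NotFoundError": // no microphone available
      return "no-device";
    default:
      return "error";
  }
}

// Same call shape as navigator.mediaDevices.getUserMedia, injected for testability.
type GetUserMedia<S> = (constraints: { audio: boolean }) => Promise<S>;

// Call on the first mic tap; this is what triggers the browser prompt.
async function requestMic<S>(
  getUserMedia: GetUserMedia<S>,
): Promise<{ state: MicState; stream?: S }> {
  try {
    const stream = await getUserMedia({ audio: true });
    return { state: "ready", stream };
  } catch (err) {
    // Keep the button visible and surface the reason instead of hiding it.
    return { state: micStateFromError(err) };
  }
}
```

In the browser this would be called as `requestMic((c) => navigator.mediaDevices.getUserMedia(c))`, with the `"blocked"` state driving the "mic blocked" UI.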
Safari MediaRecorder compatibility
- Safari's `MediaRecorder` support for `video/webm` / `audio/webm;codecs=opus` is inconsistent. iOS Safari may require an `audio/mp4` fallback. The `audio.ts` module should negotiate mimeType at runtime (`MediaRecorder.isTypeSupported()`), and Yapper's format negotiation frame needs to handle whatever format the browser actually produces.
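A minimal sketch of the runtime negotiation, assuming a fixed preference list. `CANDIDATE_MIME_TYPES`, its ordering, and `pickMimeType` are assumptions for illustration, not names from `audio.ts`. The support check is passed in so the selection logic can be exercised without a browser.

```typescript
// Preference order is an assumption: Opus in WebM first, MP4 as the Safari/iOS fallback.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
];

// Returns the first candidate the browser reports as supported,
// or undefined if none is (caller should then disable recording).
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  return candidates.find((mimeType) => isSupported(mimeType));
}
```

In the browser, `pickMimeType((t) => MediaRecorder.isTypeSupported(t))` would yield the value to pass as `mimeType` to the `MediaRecorder` constructor and to report to Yapper during format negotiation.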
Yapper model readiness (resolved)
- Open question #5 is addressed by dimakis/yapper#7: `/health` now returns `{"status": "ready"|"loading", "models": {"stt": bool, "tts": bool}}`, with a 503 while loading. Mitzo can use this to show "loading models..." instead of hiding the mic.
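A sketch of how Mitzo might turn that response into a mic state. The payload shape is the one quoted above; `VoiceAvailability` and `sttAvailability` are hypothetical names, and treating any other response as "unavailable" is an assumption.

```typescript
// Shape of Yapper's /health body as described in the review comment.
interface YapperHealth {
  status: "ready" | "loading";
  models: { stt: boolean; tts: boolean };
}

// Hypothetical UI states derived from the health check.
type VoiceAvailability = "ready" | "loading-models" | "unavailable";

function sttAvailability(httpStatus: number, body: unknown): VoiceAvailability {
  const health = body as Partial<YapperHealth> | null | undefined;
  // 503 (or an explicit "loading" status) means models are still loading.
  if (httpStatus === 503 || health?.status === "loading") return "loading-models";
  // Only report ready when the STT model itself is loaded.
  if (httpStatus === 200 && health?.status === "ready" && health?.models?.stt === true) {
    return "ready";
  }
  return "unavailable";
}
```

The `"loading-models"` state is what would drive the "loading models..." label instead of hiding the mic.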
CORS dependency
- Client-direct architecture requires Yapper to have permissive CORS (dimakis/yapper#5 adds `allow_origins=["*"]`). Worth noting as a hard dependency in the doc.
Mixed content (future)
- If Mitzo is ever served over HTTPS, `MediaRecorder` requires a secure context and HTTP calls to Yapper would be blocked as mixed content. Not a blocker now (LAN-only), but worth a "Future Considerations" note.
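For the "Future Considerations" note, the condition could be checked up front with something like the sketch below. `wouldBeMixedContent` is a hypothetical helper and the URLs in the usage note are placeholders. It is deliberately simplified: it only compares schemes, and browsers apply further exceptions (for example for loopback addresses).

```typescript
// True when a secure page would be making an insecure HTTP or WebSocket request.
function wouldBeMixedContent(pageUrl: string, requestUrl: string): boolean {
  const page = new URL(pageUrl);
  const target = new URL(requestUrl, pageUrl); // resolves relative URLs against the page
  const insecureTarget = target.protocol === "http:" || target.protocol === "ws:";
  return page.protocol === "https:" && insecureTarget;
}
```

For example, `wouldBeMixedContent("https://mitzo.example/", "http://192.168.1.20:8000/health")` is `true`, which is the case where Yapper would need TLS or a same-origin proxy.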
Minor gaps
- `MAX_RECORDING_DURATION_MS` mentioned as a constant but no value or auto-stop behavior defined.
- Error UX for Yapper 500s or empty/too-short recordings not specified.
- Doc says `useLongPress` "already exists"; verify before assuming reuse.
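To make the first two gaps concrete, here is one possible shape for the auto-stop and the empty/too-short check. The numeric values are placeholders, since the doc does not define them; `MIN_RECORDING_DURATION_MS`, `checkRecording`, and `scheduleAutoStop` are hypothetical names.

```typescript
// Placeholder values: the design doc names MAX_RECORDING_DURATION_MS but gives no value.
const MAX_RECORDING_DURATION_MS = 60000; // assumption
const MIN_RECORDING_DURATION_MS = 300; // assumption; hypothetical constant

type RecordingCheck = "ok" | "empty" | "too-short";

// Decide whether a finished recording is worth sending to Yapper.
function checkRecording(durationMs: number, sizeBytes: number): RecordingCheck {
  if (sizeBytes === 0) return "empty";
  if (durationMs < MIN_RECORDING_DURATION_MS) return "too-short";
  return "ok";
}

// Schedules an automatic stop; the returned function cancels it
// (call it when the user stops the recording manually).
function scheduleAutoStop(
  stop: () => void,
  maxMs: number = MAX_RECORDING_DURATION_MS,
): () => void {
  const timer = setTimeout(stop, maxMs);
  return () => clearTimeout(timer);
}
```

In the recorder this might be wired as `const cancel = scheduleAutoStop(() => recorder.stop())`, with `cancel()` called on release, and the `"empty"` / `"too-short"` results mapped to a visible hint rather than a silent no-op.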
Client-direct architecture: frontend talks to Yapper for STT/TTS, server stays text-only. Three phases: batch STT, TTS playback, streaming STT. Each phase ships as a separate PR with tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3ae8cf3 to 0a9aba2
Summary
Implementation phases
Test plan
🤖 Generated with Claude Code