Title: Read Aloud UX: buffering gaps, document switching issues, and playback state synchronization
First: Mayari is the closest thing I've found to a true local AI audiobook reader. The overall concept is excellent and the native Kokoro integration is exactly the direction I'd like to see for long-form reading.
However, two issues in Read Aloud mode make audiobook listening feel fragmented and unreliable.
1. Large pauses between chunks during live Read Aloud
Current behavior:
- Document is split into chunks/paragraphs
- A chunk is spoken
- Playback pauses
- "Preparing audio..." appears
- The next chunk is generated
- Playback resumes
This creates noticeable gaps between chunks.
As a listener, it constantly reminds me that I'm hearing thousands of separate generated clips rather than one continuous audiobook.
Interestingly, generated audiobook files sound much better because the gaps are far less noticeable.
Suggestions:
- Generate upcoming chunks while the current chunk is still playing
- Maintain a rolling audio buffer
- Add an aggressive pre-buffering mode
- Allow users to configure pause length between chunks
- Optionally merge multiple paragraphs into larger synthesis units
For audiobook listening, seamless playback is arguably more important than voice quality.
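To make the first two suggestions concrete, here is a minimal sketch of a rolling buffer that synthesizes ahead while the current chunk plays. I don't know how Mayari's pipeline is actually structured, so everything here is an assumption: `read_aloud`, `synthesize`, `play`, and `lookahead` are hypothetical names, and the two callables stand in for whatever the Kokoro call and the audio output really look like.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def read_aloud(
    chunks: Iterable[str],
    synthesize: Callable[[str], Awaitable[bytes]],  # hypothetical TTS call
    play: Callable[[bytes], Awaitable[None]],       # hypothetical audio output
    lookahead: int = 3,
) -> None:
    """Play chunks in order while synthesizing up to `lookahead` chunks ahead."""
    # Bounded queue = rolling buffer: the producer stalls once it is full.
    buffer: asyncio.Queue = asyncio.Queue(maxsize=lookahead)

    async def producer() -> None:
        try:
            for text in chunks:
                await buffer.put(await synthesize(text))
            await buffer.put(None)  # sentinel: no more audio
        except Exception as exc:
            await buffer.put(exc)   # surface synthesis errors to the player

    task = asyncio.create_task(producer())
    try:
        while True:
            item = await buffer.get()
            if item is None:
                break
            if isinstance(item, Exception):
                raise item
            # While this chunk plays, the producer keeps filling the buffer.
            await play(item)
    finally:
        # Stop synthesizing if playback ends early (stop button, error).
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)
```

With something like this, "Preparing audio..." would only need to appear before the first chunk, or when synthesis is slower than playback for long enough to drain the buffer. An "aggressive pre-buffering mode" could then simply be a larger `lookahead`.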
2. Read Aloud sometimes remains attached to a previous document
I also frequently encounter situations where the visible document and the active Read Aloud source become desynchronized.
Example:
- Open Book A
- Start Read Aloud
- Open Book B
- The UI correctly displays Book B
- Press Read Aloud
- Playback continues from Book A
The text pane clearly shows the new document, but the TTS engine appears to remain attached to the previous document/session.
This creates a confusing situation where:
- The screen shows one book
- Audio comes from another book
Sometimes the app eventually switches to the new document on its own, but often the only reliable workaround is restarting the application.
Expected behavior:
Whenever a new document becomes active, Read Aloud should either:
- Automatically bind to the newly active document, or
- Stop playback and clearly ask whether playback should continue from the new document
A visible indicator such as:
"Current TTS source: [Document Name]"
would also make the current playback source obvious.
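As an illustration of the expected behavior, here is a small sketch of the first option (automatic rebinding), where a Read Aloud session can never outlive the document it was started on. This is not Mayari's code. `ReadAloudController`, `Session`, and all method names are hypothetical, and a real implementation would also have to cancel any in-flight synthesis and queued audio.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    document_id: str       # the document this playback session is bound to
    playing: bool = False


class ReadAloudController:
    """Keeps the invariant: an existing session always belongs to the active document."""

    def __init__(self) -> None:
        self.active_document: Optional[str] = None
        self.session: Optional[Session] = None

    def set_active_document(self, document_id: str) -> None:
        self.active_document = document_id
        # Drop any session that belongs to a different document, so a later
        # "Read Aloud" press cannot resume the previous book.
        if self.session and self.session.document_id != document_id:
            self.session.playing = False
            self.session = None

    def press_read_aloud(self) -> Session:
        if self.active_document is None:
            raise RuntimeError("no document is open")
        if self.session is None:
            self.session = Session(self.active_document)
        self.session.playing = True
        return self.session

    def source_label(self) -> str:
        name = self.session.document_id if self.session else "none"
        return f"Current TTS source: {name}"
```

Replaying the example above against this sketch: open Book A, press Read Aloud, open Book B, press Read Aloud. The second press creates a fresh session for Book B, and the label reads "Current TTS source: Book B".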
## Overall
Both issues feel related to playback state management.
The document reader itself updates correctly, but the Read Aloud subsystem often feels detached from what is currently visible on screen.
The result is that the reading experience sometimes feels fragmented:
- Audible pauses between chunks
- Unclear playback ownership
- Occasional playback from the wrong document
Mayari already has the foundations of an excellent local AI audiobook reader. Improving buffering and document/TTS synchronization would dramatically improve the day-to-day listening experience.