
[ADD] Enable/Disable Button. UI Enhancements, Info Icon, Fixed Text L…#232

Open
AnmollGarg wants to merge 10 commits into CITOpenRep:release/v1.2.9 from AnmollGarg:feature/voice_2_text

Conversation

@AnmollGarg

Summary

This PR introduces voice-to-text enhancements, including UI improvements, dynamic timeout configuration, backend logic updates, and a toggle button to enable or disable the feature. It also fixes a bug where text did not update live on the "Read More" page.

Type

  • Feature
  • Bug Fix
  • Refactor
  • Documentation
  • Performance
  • Testing

Motivation

The goal is to improve the user experience and reliability of the voice-to-text functionality. Users needed a way to toggle the feature on/off, clear visual indicators (info icon), and more robust text updates in real-time. Dynamic timeouts ensure the system accommodates varying speech patterns.

Changes

  1. UI Enhancements & Features: Added an Enable/Disable button, informational icons, and settings pages (Settings_Page.qml, Settings_VoiceModel.qml).
  2. Text Synchronization: Fixed an issue where text was not updating live on the "Read More" page (RichTextEditor.qml, RichTextPreview.qml, HtmlEditorToolbar.qml).
  3. Backend & Configuration: Refactored src/backend.py and voice_to_text/voice2text.py to handle dynamic timeouts and core staging logic.

Testing

  • Tested on Emulator (clickable build && clickable review)
  • Tested on device (clickable install)
  • Unit tests added/updated
  • No existing Functionality/Tests broken

Screenshots (if UI change)

Before After

Checklist

  • Code follows existing style conventions
  • No hardcoded credentials or secrets
  • No debug logs left in production code
  • Documentation updated (if applicable)
  • Squash commits before review (if needed)
  • PR is up-to-date with main or release/x.x.x branch

Related Issues

  1. Voice-to-Text Feature Not Working with Large Data Models #231
  2. Voice to Text is Not working in Rich Text based Read More Page. #230
  3. Info Button for Voice to Text Feature. #227
  4. Enhancement in Voice to text Feature #226
  5. Voice-to-Text Feature Produces Inaccurate Transcriptions #224
  6. Delay in Voice Capture and Language Recognition Behavior During Speech Input #220

Additional Notes

The backend changes introduce dynamic timeout staging which directly correlates with the new settings UI. Ensure the speech model is downloaded/available when testing on-device.

AnmollGarg and others added 3 commits June 12, 2026 15:03
…ive Updation in Read More Page, Dynamic Timeout, Stagging
Co-authored-by: AnmollGarg <anmolgarg12203@gmail.com>
 [ADD] Search Feature, Renamed the Page Name
suraj-yadav0 self-assigned this Jun 12, 2026
suraj-yadav0 self-requested a review June 12, 2026 11:32
suraj-yadav0 added the bug and enhancement labels Jun 12, 2026
@suraj-yadav0

Collaborator

Hi @AnmollGarg, the bug on the Read More page persists.

  1. Speech-to-text is not available on the Timesheet's Read More page.
  2. On the rich-text-based Read More page, text is added twice to the text area. Even the word "Listening" is inserted.

@AnmollGarg

Author

> Hi @AnmollGarg, the bug on the Read More page persists.
>
> 1. Speech-to-text is not available on the Timesheet's Read More page.
> 2. On the rich-text-based Read More page, text is added twice to the text area. Even the word "Listening" is inserted.

Hi @suraj-yadav0

I have fixed the bug you mentioned.

@suraj-yadav0 suraj-yadav0 left a comment

Collaborator

If a user is actively recording speech-to-text and closes the page or navigates back, the background arecord/ffmpeg subprocess remains active until the silence timeout (7 seconds) or hard timeout (300 seconds) is reached.

Refer to these files: voice2text.py, ReadMorePage.qml, RichTextEditor.qml, and RichTextPreview.qml.

Add cleanup handlers to ensure active voice processes are explicitly stopped when components are destroyed or the user navigates away.

Something like:

Component.onDestruction: {
    // Function name passed as a string, PyOtherSide style; adjust to the bridge's actual API.
    if (listening || processing) {
        backend_bridge.call("backend.stop_voice_recognition", [])
    }
}

@suraj-yadav0

Collaborator

@AnmollGarg @parvathyabnair I also recommend adding a Stop action button to the Voice Over pop-over, which would make it more intuitive for users.

@parvathyabnair

Contributor

@suraj-yadav0 Can you please review the changes and let me know if there are any issues?

Comment thread src/backend.py

Collaborator

In backend.py the voice recognition subprocess is started using Python's default multiprocessing.Process. On Linux, this defaults to the fork start method. When executed inside a Qt/QML context (via PyOtherSide), forking duplicates the parent application's memory and handles (including active C++ Qt structures and SQLite database descriptors) while keeping only the calling thread alive. This regularly leads to deadlocks or memory corruption when the child process attempts to execute or terminate.

Solution: Explicitly start the subprocess using a spawn context to create a completely clean Python interpreter instance:

import multiprocessing
ctx = multiprocessing.get_context("spawn")
p = ctx.Process(target=_subprocess_recognize, args=(str(model_path), child_conn, 30))
p.start()

Comment thread src/backend.py

Collaborator

The backend runs a message loop of the form while p.is_alive(). If the recording subprocess hangs on a blocking read (e.g. arecord or ffmpeg locks stdout without exiting), the child process will never poll the pipe for the "stop" instruction. In this scenario, the parent loop will spin or hang indefinitely, leaking background threads in the Python bridge.

Suggested solution: implement a watchdog timer in the parent process that terminates the child if it does not respond to the stop signal.
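
A minimal sketch of such a watchdog, assuming the parent holds the Process handle and its end of a multiprocessing Pipe. The names stop_with_watchdog and _stubborn_child and the grace period are hypothetical and not taken from backend.py; the child here only simulates a recognizer that never polls the pipe.

```python
import multiprocessing
import time


def _stubborn_child(conn):
    """Stand-in for a recognizer stuck on a blocking read; it never polls conn."""
    while True:
        time.sleep(0.1)


def stop_with_watchdog(proc, conn, grace=2.0):
    """Ask the child to stop, then escalate if it does not exit in time.

    Returns "clean", "terminated", or "killed" so the caller can log the outcome.
    """
    try:
        conn.send("stop")  # cooperative request, honoured only if the child polls
    except OSError:
        pass  # pipe already closed or child already gone
    proc.join(grace)
    if not proc.is_alive():
        return "clean"
    proc.terminate()  # SIGTERM on POSIX
    proc.join(grace)
    if not proc.is_alive():
        return "terminated"
    proc.kill()  # SIGKILL as a last resort
    proc.join()
    return "killed"


if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=_stubborn_child, args=(child_conn,))
    p.start()
    print(stop_with_watchdog(p, parent_conn, grace=1.0))
```

Bounding every join with a timeout means the parent never waits longer than roughly twice the grace period before the child is forcibly removed, so the while p.is_alive() loop cannot spin forever.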

Collaborator

If a Python exception is raised inside the decoding loop (e.g., in rec.AcceptWaveform(data) or json.loads), control immediately transfers to the except block. Because neither process.terminate() nor process.wait() is called on the exception path, the recording process (arecord / ffmpeg) keeps running and consumes background resources.

Solution: protect the subprocess lifecycle with a try...finally block so the recorder is always stopped.
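
One way this could look, as a sketch only: the function name drain_recorder, the handle_chunk callback, and the stand-in recorder command are assumptions for illustration, not the code in voice2text.py. The point is that cleanup lives in finally, so it runs whether the loop ends normally or the handler raises.

```python
import subprocess
import sys


def drain_recorder(process, handle_chunk, chunk_size=4000):
    """Feed the recorder's stdout to handle_chunk, always stopping the recorder.

    The finally block runs even when handle_chunk raises (for example a
    decoding or JSON error), so the child process is never left behind.
    """
    try:
        while True:
            data = process.stdout.read(chunk_size)
            if not data:
                break  # recorder closed its stdout
            handle_chunk(data)
    finally:
        if process.poll() is None:  # still running
            process.terminate()
            try:
                process.wait(timeout=2)
            except subprocess.TimeoutExpired:
                process.kill()
                process.wait()
        process.stdout.close()
    return process.returncode


if __name__ == "__main__":
    # Stand-in for arecord/ffmpeg: a child that writes bytes forever.
    fake_recorder = [
        sys.executable, "-c",
        "import sys\nwhile True: sys.stdout.buffer.write(b'x' * 4000)",
    ]

    def failing_handler(data):
        raise ValueError("simulated decoding failure")

    proc = subprocess.Popen(fake_recorder, stdout=subprocess.PIPE)
    try:
        drain_recorder(proc, failing_handler)
    except ValueError as exc:
        print("handler failed:", exc, "| recorder exited:", proc.poll() is not None)
```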

Collaborator

In Settings_VoiceModel.qml, ReadMorePage.qml, RichTextEditor.qml, and RichTextPreview.qml:

The voice integration settings introduce a new SQLite file, UBTMS_SettingsDB, whereas all other configuration pages (Theme, Accounts, Notifications, Sync) use myDatabase. This fragments the app's data across two databases and breaks consistency with the existing pattern.

Solution: consolidate the voice settings under myDatabase by replacing the database open call:

var db = Sql.LocalStorage.openDatabaseSync("myDatabase", "1.0", "My Database", 1000000);

Collaborator

In ReadMorePage.qml (and the RichText views): if a user navigates to the Read More page before visiting Settings on a fresh launch, the query SELECT value FROM app_settings runs before the table has been created and throws an SQL exception. Although the exception is caught by a try/catch, it clutters the logs.

Solution: always run CREATE TABLE IF NOT EXISTS app_settings (key TEXT PRIMARY KEY, value TEXT) inside the transaction block before querying.
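
The same create-before-query pattern, sketched with Python's sqlite3 module rather than the QML LocalStorage API the PR uses. The helper name read_setting and the key voice_enabled are hypothetical; only the app_settings schema comes from the comment above.

```python
import sqlite3


def read_setting(conn, key, default=None):
    """Return a setting value, creating app_settings first if it is missing."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS app_settings "
            "(key TEXT PRIMARY KEY, value TEXT)"
        )
        row = conn.execute(
            "SELECT value FROM app_settings WHERE key = ?", (key,)
        ).fetchone()
    return row[0] if row else default


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # fresh database, no tables yet
    # No exception on first launch: the table is created before the SELECT.
    print(read_setting(conn, "voice_enabled", default="false"))
    conn.execute(
        "INSERT INTO app_settings (key, value) VALUES (?, ?)",
        ("voice_enabled", "true"),
    )
    print(read_setting(conn, "voice_enabled", default="false"))
```

Because CREATE TABLE IF NOT EXISTS is a no-op when the table already exists, running it on every read is cheap and removes the dependency on the Settings page having been opened first.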


function isModelCompatible(sizeStr) {
if (!sizeStr) return true;
var isG = sizeStr.indexOf("G") !== -1;

Collaborator

The RAM compatibility check relies on indexOf("G"), which is case-sensitive. It will miss gigabyte sizes when a model size is labeled with a lowercase "gb".

Solution: normalize the size string to uppercase and clean it using a regex. The uppercase step looks like this:

var upperSize = sizeStr.toUpperCase();
var isG = upperSize.indexOf("G") !== -1;
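
For the regex part of the suggestion, here is a rough sketch of the parsing logic written in Python for illustration (the PR's function is QML JavaScript). The helper name parse_size_mb, the accepted label formats, and the 1024-based conversion are all assumptions; the actual size labels used by the model list are not shown in this thread.

```python
import re

# Number, optional whitespace, unit letter, optional "B" or "iB" suffix.
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGT])(?:I?B)?\s*$")
_UNIT_TO_MB = {"K": 1 / 1024, "M": 1, "G": 1024, "T": 1024 * 1024}


def parse_size_mb(size_str):
    """Convert labels such as '1.5G', '50 MB', or '2gb' to megabytes.

    Returns None when the label is empty or not recognized.
    """
    if not size_str:
        return None
    match = _SIZE_RE.match(size_str.upper())  # uppercase first, then parse
    if match is None:
        return None
    number, unit = match.groups()
    return float(number) * _UNIT_TO_MB[unit]


if __name__ == "__main__":
    for label in ("2gb", "50 MB", "1.5G", "big"):
        print(label, "->", parse_size_mb(label))
```

Anchoring the pattern to the whole string also avoids a side effect of a bare indexOf("G") check, which would match any label that merely contains the letter G.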


3 participants