When changing the transcription model from base.en to small.en, the whole system gets stuck and the UI is not updated

Hi,

When I change the model from base.en to small.en, there appears to be a race condition that does not occur with the base.en model.
The model transcribes my voice correctly, as I can see in the logs, but the system does not pass my text on to the LLM engine.

DEFAULT_RECORDER_CONFIG: Dict[str, Any] = {
    "use_microphone": False,
    "spinner": False,
    "model": "small.en",
    "realtime_model_type": "small.en",
    "use_main_model_for_realtime": False,
    "language": "en", # Default, will be overridden by source_language in __init__
    "silero_sensitivity": 0.05,
    "webrtc_sensitivity": 3,
    "post_speech_silence_duration": 0.7,
    "min_length_of_recording": 0.5,
    "min_gap_between_recordings": 0,
    "enable_realtime_transcription": True,
    "realtime_processing_pause": 0.03,
    "silero_use_onnx": True,
    "silero_deactivity_detection": True,
    "early_transcription_on_silence": 0,
    "beam_size": 3,
    "beam_size_realtime": 3,
    "no_log_file": True,
    "wake_words": "jarvis",
    "wakeword_backend": "pvporcupine",
    "allowed_latency_limit": 500,
    # Callbacks will be added dynamically in _create_recorder
    "debug_mode": True,
    "initial_prompt_realtime": "The sky is blue. When the sky... She walked home. Because he... Today is sunny. If only I...",
    "faster_whisper_vad_filter": False,
}

The base.en model is not accurate enough for transcription. Has anyone faced this issue? This is the log when using small.en:

41:32.33 server     INFO 🖥️🚦 State  ToClient 0, ttsClientON 0, ChunkSent 0, hot 0, synth 0 gen 0 valid 0 tts_q_fin 0 mic_inter 0
41:32.55 uvicorn.ac INFO 127.0.0.1:65010 - "GET /static/pcmWorkletProcessor.js HTTP/1.1" 200
41:32.56 uvicorn.ac INFO 127.0.0.1:65010 - "GET /static/ttsPlaybackProcessor.js HTTP/1.1" 200
41:34.18 transcribe INFO 👂▶️ Recording started.
41:34.18 server     INFO 🖥️🎙️ Recording started. TTS Client Playing: False
41:34.19 faster_whi INFO Processing audio with duration 00:01.024
41:34.19 faster_whi INFO VAD filter removed 00:00.000 of audio
41:34.70 transcribe INFO 👂🤫 Silence state changed: ACTIVE


HOT
41:35.05 server     INFO 🖥️🧠 HOT: None
4
41:35.14 transcribe INFO 👂🔚 Potential sentence end detected (timed out):
--- many more "timed out" events follow; those log lines are omitted ---
41:53.47 transcribe INFO 👂⏹️ Recording stopped.
41:53.48 server     INFO 🖥️🏁 =================== USER TURN END ===================
41:53.48 server     INFO 🖥️🎙️ ⏸️ Microphone interrupted (end of turn)
41:53.48 server     INFO 🖥️🔊 TTS STREAM RELEASED
41:53.48 transcribe INFO 👂🔚 Potential sentence end detected (timed out):
41:53.48 server     INFO 🖥️🧠 Adding user request to history: 'Yo buddy!'
41:53.48 server     INFO 🖥️📤 →→Client: {'type': 'final_user_request', 'content': 'Yo buddy!'}
41:53.48 transcribe INFO 👂🤫 Silence state changed: INACTIVE
41:53.48 transcribe INFO 👂🔚 Potential sentence end detected (timed out): 
41:53.48 server     INFO 🖥️🚦 State  ToClient 1, ttsClientON 0, ChunkSent 0, hot 0, synth 0 gen 0 valid 0 tts_q_fin 0 mic_inter 1
41:53.49 faster_whi INFO Processing audio with duration 00:02.240
41:53.49 faster_whi INFO VAD filter removed 00:00.000 of audio
41:55.48 server     INFO 🖥️🎙️ interruption flag reset after 2 seconds
41:55.48 server     INFO 🖥️🚦 State  ToClient 1, ttsClientON 0, ChunkSent 0, hot 0, synth 0 gen 0 valid 0 tts_q_fin 0 mic_inter 0
41:56.22 transcribe INFO 👂✅ Final user text:  Are you listening?
41:56.22 turndetect INFO 🎤🔄 Resetting TurnDetection state.
41:56.22 server     INFO
🖥️✅ FINAL USER REQUEST (STT Callback): Are you listening?
42:18.34 uvicorn.er INFO Shutting down
42:18.34 server     ERRO 🖥️💥 RUNTIME_ERROR in process_incoming_data: RuntimeError('Cannot call "receive" once a disconnect message has been received.')
42:18.34 uvicorn.er INFO connection closed
42:18.34 audio_in   INFO 👂🚫 Audio processing task cancelled.
42:18.34 audio_in   INFO 👂⏹️ Audio chunk processing loop finished.
42:18.34 server     INFO 🖥️🧹 Cleaning up WebSocket tasks...
42:18.34 server     INFO 🖥️❌ WebSocket session ended.
42:18.44 uvicorn.er INFO Waiting for application shutdown.
42:18.44 server     INFO 🖥️⏹️ Server shutting down
42:18.44 audio_in   INFO 👂🛑 Shutting down AudioInputProcessor...
42:18.44 audio_in   INFO 👂🛑 Signaling TranscriptionProcessor to shut down.
42:18.44 transcribe INFO 👂🔌 Shutting down TranscriptionProcessor...
42:18.44 transcribe INFO 👂🔌 Calling recorder shutdown()...
RealtimeSTT shutting down
42:19.83 transcribe INFO 👂🔌 Recorder shutdown() method completed.
42:19.83 transcribe INFO 👂🔌 TranscriptionProcessor shutdown process finished.
42:19.83 audio_in   INFO 👂🚫 Cancelling background transcription task (Task-3)...
42:19.83 audio_in   INFO 👂👋 AudioInputProcessor shutdown sequence initiated.
42:19.83 audio_in   INFO 👂🚫 Transcription loop (Task-3) cancelled.
42:19.83 audio_in   INFO 👂⏹️ Background transcription task (Task-3) finished.
42:19.83 uvicorn.er INFO Application shutdown complete.
42:19.83 uvicorn.er INFO Finished server process [52524]
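
To put a number on the delay visible in the log above, here is a minimal sketch that measures the time between two log markers. It assumes every log line starts with an `MM:SS.ss` timestamp, as in the excerpt; `to_seconds` and `gap_between` are hypothetical helper names, not part of the project. In this log the final text arrives only after the "interruption flag reset after 2 seconds" line, which might be related to the problem, but that is an assumption on my side and not something the log confirms.

```python
import re
from typing import Optional

# Matches the "MM:SS.ss" prefix used in the log excerpt above.
# There is no hour field, so a rollover past 59:59 is not handled.
TS = re.compile(r"^(\d{2}):(\d{2}\.\d{2})\s")


def to_seconds(line: str) -> Optional[float]:
    """Return the timestamp of a log line in seconds, or None if it has none."""
    m = TS.match(line)
    if m is None:
        return None
    return int(m.group(1)) * 60 + float(m.group(2))


def gap_between(log: str, start_marker: str, end_marker: str) -> Optional[float]:
    """Seconds from the first line containing start_marker to the next
    line containing end_marker; None if either marker is missing."""
    start = end = None
    for line in log.splitlines():
        t = to_seconds(line)
        if t is None:
            continue  # skip lines without a timestamp prefix
        if start is None and start_marker in line:
            start = t
        elif start is not None and end_marker in line:
            end = t
            break
    if start is None or end is None:
        return None
    return round(end - start, 2)


# Four lines copied from the log above.
sample = """\
41:53.47 transcribe INFO 👂⏹️ Recording stopped.
41:53.48 server     INFO 🖥️🏁 =================== USER TURN END ===================
41:55.48 server     INFO 🖥️🎙️ interruption flag reset after 2 seconds
41:56.22 transcribe INFO 👂✅ Final user text:  Are you listening?
"""

print(gap_between(sample, "Recording stopped", "Final user text"))
```

Running the same function over a base.en log would show whether the final transcription there arrives within the 2-second window.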


When changing the transcription model from base.en to small.en, the whole system gets stuck and the UI is not updated #47
