Zero token counts for empty transcripts and README alignment by gistrec · Pull Request #43 · gistrec/ClearTranscriptBot

gistrec · 2026-01-20T16:20:46Z

Avoid counting tokens for the user-facing fallback text when transcription is empty and ensure zero token counts are recorded for empty transcriptions.
Keep stored text user-friendly while basing token accounting on the raw recognition output.
Fix README schema formatting so the llm_tokens_by_model column aligns with other columns for readability.
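The ordering behind the first two points can be sketched as below. This is a minimal illustration, not the bot's actual code: `finalize_transcript`, `FALLBACK_TEXT`, and the injected `count_tokens` callable are hypothetical names, and treating whitespace-only output as empty is an assumption.

```python
from typing import Callable, Dict, Tuple

# Hypothetical fallback string; the real user-facing text is not shown in this PR.
FALLBACK_TEXT = "(speech was not recognized)"


def finalize_transcript(
    raw_text: str,
    count_tokens: Callable[[str], Dict[str, int]],
) -> Tuple[str, Dict[str, int]]:
    """Return (text_to_store, token_counts) for one recognition result."""
    # Count tokens on the raw recognition output first, so the fallback
    # text never contributes to the stored token counts.
    token_counts = count_tokens(raw_text)
    # Only afterwards swap in the friendly fallback for empty output
    # (assumption: whitespace-only output counts as empty).
    stored_text = raw_text if raw_text.strip() else FALLBACK_TEXT
    return stored_text, token_counts
```

Counting before substitution is what keeps the stored text user-friendly while the accounting still reflects what was actually recognized.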

Update utils/tokens.py so that tokens_by_model returns zero for every model when text.strip() is empty, while keeping the mapping keyed by LLM_TOKEN_MODELS.
Change schedulers/transcription.py to compute token_counts = tokens_by_model(raw_text) from the raw parse_text(result) output, and only then replace empty text with the friendly fallback string before persisting results.
Pass llm_tokens_by_model=token_counts to update_transcription on both the success and failure paths.
Adjust README.md spacing for the llm_tokens_by_model JSON column to align with other schema columns.
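A rough sketch of the empty-text guard described for utils/tokens.py follows. The contents of `LLM_TOKEN_MODELS` and the tokenizer are not shown in this PR, so the model names and the whitespace-based `_whitespace_token_count` helper here are placeholders chosen only for illustration.

```python
from typing import Callable, Dict

# Placeholder model names; the real LLM_TOKEN_MODELS values are not shown in this PR.
LLM_TOKEN_MODELS = ("model-a", "model-b")


def _whitespace_token_count(text: str, model: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def tokens_by_model(
    text: str,
    count: Callable[[str, str], int] = _whitespace_token_count,
) -> Dict[str, int]:
    """Map every model in LLM_TOKEN_MODELS to a token count for `text`."""
    # Empty or whitespace-only text: report zero for every model
    # without calling the tokenizer at all.
    if not text.strip():
        return {model: 0 for model in LLM_TOKEN_MODELS}
    return {model: count(text, model) for model in LLM_TOKEN_MODELS}
```

Returning an explicit zero per model, rather than an empty mapping, keeps the shape of the llm_tokens_by_model value the same for empty and non-empty transcripts.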

No automated tests were run for this change.
Verification was limited to local static inspection and manual review of the modified files, after which the changes were committed.

utils/tokens.py

Zero token counts for empty transcripts

5c80731

gistrec added the codex label Jan 20, 2026 — with ChatGPT Codex Connector

sentry bot reviewed Jan 20, 2026


utils/tokens.py (review thread resolved)

fixup! Zero token counts for empty transcripts

0d272de
