Automatically cuts silence, splits audio into phrases, allows you to remove failed takes, and applies minimal professional audio processing.
- Silence Detection — automatically finds long pauses between phrases
- Speech Recognition — displays the text of each phrase (offline, via Vosk)
- Phrase Splitting — divides the recording into separate meaningful fragments
- Selective Deletion — interactive selection of failed takes for removal
- Compressor — evens out volume for better sound
- Gain — automatically raises volume to maximum without clipping
- Smart Trimming — cuts phrase edges at zero crossings so that joins are free of clicks
- Export — saves the result as a 48 kHz, 16-bit stereo WAV file
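
The silence detection, smart trimming, and gain features listed above can be sketched in a few lines of plain Python. This is only an illustration of the general techniques, not the script's actual code: the function names, the amplitude threshold, the minimum pause length, and the search window are all assumptions chosen for the example, and a real implementation would likely use NumPy for speed.

```python
from typing import List, Sequence, Tuple


def find_phrases(samples: Sequence[int], rate: int,
                 threshold: int = 500,
                 min_pause_s: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges of phrases split at long pauses.

    A sample counts as quiet when its absolute value is below `threshold`.
    A pause is a run of quiet samples lasting at least `min_pause_s`.
    Both defaults are illustrative assumptions.
    """
    min_pause = int(min_pause_s * rate)
    phrases = []
    start = None    # start index of the phrase being collected, if any
    quiet_run = 0   # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause:
                # The phrase ended where the quiet run began.
                phrases.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        phrases.append((start, len(samples) - quiet_run))
    return phrases


def snap_to_zero_crossing(samples: Sequence[int], index: int,
                          search: int = 200) -> int:
    """Move `index` to the nearest sign change within `search` samples."""
    n = len(samples)
    for offset in range(search + 1):
        for j in (index - offset, index + offset):
            if 1 <= j < n and (samples[j - 1] < 0) != (samples[j] < 0):
                return j
    return index  # no crossing found nearby; keep the original cut point


def normalize_peak(samples: Sequence[int], full_scale: int = 32767) -> List[int]:
    """Scale 16-bit samples so the loudest one reaches full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)
    gain = full_scale / peak
    # Clamp after rounding so the result can never exceed full scale.
    return [max(-full_scale, min(full_scale, int(round(s * gain))))
            for s in samples]
```

In this sketch, each range returned by `find_phrases` would have its edges passed through `snap_to_zero_crossing` before cutting, and `normalize_peak` would be applied last, once the kept phrases are joined.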
- Python 3.8 or newer
- pip (Python package manager)
- Vosk speech recognition model for the Russian language
git clone https://github.com/ncojam/audio_editor.git
cd audio_editor

Download one of the language models from the official Vosk website.
Extract the archive into the "model" folder next to the script.
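
To make the role of the "model" folder concrete, here is a hedged sketch of how a script might locate the extracted model and run offline recognition with the Vosk Python package. The helper names `find_model_dir` and `transcribe` are hypothetical and not taken from audio_editor.py. The sketch also assumes the input is mono 16-bit PCM, which is what Vosk expects.

```python
import json
import wave
from pathlib import Path


def find_model_dir(script_dir: Path) -> Path:
    """Return the 'model' folder next to the script.

    Raises FileNotFoundError if the folder is missing or empty.
    """
    model_dir = script_dir / "model"
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        raise FileNotFoundError(f"Extract a Vosk model into {model_dir}")
    return model_dir


def transcribe(wav_path: Path, model_dir: Path) -> str:
    """Recognize speech in a mono 16-bit PCM WAV file (assumed format)."""
    # Imported lazily so the folder check above works without Vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(str(model_dir))
    parts = []
    with wave.open(str(wav_path), "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns true when an utterance is complete.
            if rec.AcceptWaveform(data):
                parts.append(json.loads(rec.Result()).get("text", ""))
        parts.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in parts if p)
```

Calling `find_model_dir(Path(__file__).parent)` at startup gives a clear error message when the model has not been extracted yet, instead of a failure deep inside the recognizer.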
Windows:
python -m venv venv
venv\Scripts\activate

Linux/macOS:
python3 -m venv venv
source venv/bin/activate

Install dependencies, place a .wav file next to the script, and run it:
pip install -r requirements.txt
python audio_editor.py