WetFlow

Push-to-talk transcription for Windows. Hold F12, speak, release — transcribed text is injected at the cursor. Runs entirely locally; no API key or internet connection required after first launch.

Download

Pre-built releases for Windows 10/11 (x64) are available on the Releases page. No .NET installation required.

  1. Download the latest WetFlow-vX.Y.Z-win-x64.zip from the Releases page.
  2. Right-click the ZIP → Extract All… and choose a permanent folder (e.g. C:\Program Files\WetFlow or %LOCALAPPDATA%\WetFlow). Don't run it directly from the ZIP.
  3. Open the extracted folder and double-click WetFlow.exe.
  4. Windows may show a SmartScreen warning ("Windows protected your PC") because the exe is unsigned — click More info → Run anyway.
  5. WetFlow appears in the system tray (bottom-right). On first launch it downloads the Whisper base model (~150 MB) — the tray tooltip will show "Downloading…" until it's ready.
  6. Hold F12, speak, then release. Transcribed text is injected at the cursor.

How it works

Push-to-talk (default):

[F12 ↓] → record mic  →  [F12 ↑] → Whisper transcribes → text injected at cursor
                                                         → chime + balloon tip
                                                         → tray icon turns green until you paste

Toggle mode:

[F12 ↓] → record mic  →  [F12 ↓ again] → Whisper transcribes → text injected at cursor
                                                              → chime + balloon tip
                                                              → tray icon turns green until you paste

The green icon indicates the transcribed text is still on the clipboard. It reverts to the normal icon once you paste (or replace the clipboard contents). It does not appear when Output mode is set to Keyboard only.

Audio is captured via WASAPI, resampled to 16 kHz mono, and transcribed by Whisper.net using a local GGML model downloaded on first use (~150 MB for the default base model). By default, text is injected via SendInput (Unicode). The Output mode setting controls delivery: Keyboard and clipboard (default) injects via SendInput and also writes to the clipboard, Keyboard only skips the clipboard, and Clipboard only skips SendInput entirely (useful when the target app blocks SendInput).
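The three Output mode options can be summarized as a small decision table. The sketch below is illustrative Python, not WetFlow's C# code: `OutputMode` and `delivery_plan` are hypothetical names, and the assumption that the green icon appears in Clipboard only mode is inferred from the statement that it is hidden only in Keyboard only mode.

```python
from enum import Enum


class OutputMode(Enum):
    # Values mirror the three Output mode options named in this README.
    KEYBOARD_AND_CLIPBOARD = "Keyboard and clipboard"
    KEYBOARD_ONLY = "Keyboard only"
    CLIPBOARD_ONLY = "Clipboard only"


def delivery_plan(mode: OutputMode) -> dict:
    """Return which delivery channels a mode uses (hypothetical helper)."""
    use_send_input = mode is not OutputMode.CLIPBOARD_ONLY  # keyboard injection
    use_clipboard = mode is not OutputMode.KEYBOARD_ONLY    # clipboard write
    return {
        "send_input": use_send_input,
        "clipboard": use_clipboard,
        # Assumption: the green icon tracks whether text was put on the clipboard.
        "green_icon": use_clipboard,
    }


if __name__ == "__main__":
    for mode in OutputMode:
        print(mode.value, delivery_plan(mode))
```

Modeling the setting as two independent flags makes it easy to see that Clipboard only is the one mode that never calls SendInput.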

Requirements

To run (pre-built release):

  • Windows 10/11 (x64)
  • A microphone
  • (Optional) NVIDIA GPU with CUDA support — required for the "Use GPU" setting

To build from source:

  • .NET 8 SDK (the build output targets net8.0-windows)

Build & run

git clone https://github.com/wetware0/wetflow.git
cd wetflow
dotnet build src/WetFlow.csproj -c Release
.\src\bin\Release\net8.0-windows\WetFlow.exe

On first use, the Whisper base model (~150 MB) is downloaded automatically to %APPDATA%\wetflow\models\. Subsequent launches reuse the downloaded model and skip this step.

Run on login

Right-click the tray icon and choose Settings, or create a startup shortcut:

Pre-built release (replace path with where you extracted the ZIP):

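# Path to the extracted executable; adjust to your install folder.
$exe = "C:\Path\To\WetFlow\WetFlow.exe"
# Shortcut in the per-user Startup folder, which Windows runs at login.
$lnk = "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\WetFlow.lnk"
$wsh = New-Object -ComObject WScript.Shell
$s = $wsh.CreateShortcut($lnk); $s.TargetPath = $exe; $s.Save()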

Build from source (run from repo root after building):

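# Path to the freshly built executable, relative to the repo root.
$exe = "$PWD\src\bin\Release\net8.0-windows\WetFlow.exe"
# Shortcut in the per-user Startup folder, which Windows runs at login.
$lnk = "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\WetFlow.lnk"
$wsh = New-Object -ComObject WScript.Shell
$s = $wsh.CreateShortcut($lnk); $s.TargetPath = $exe; $s.Save()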

Settings

Right-click the tray icon → Settings:

| Setting | Default | Notes |
| --- | --- | --- |
| Hotkey | F12 | Any key; modifier keys (Shift, Ctrl, …) work but can cause sticky-key behavior |
| Whisper model | base | tiny is fastest; small / medium are more accurate but slower and larger; .en variants (e.g. base.en) are English-only but faster; -q5_1 variants are quantized (smaller download, marginal quality tradeoff) |
| Short pause (sec) | 0.5 | Gap between Whisper segments that inserts a single newline (\n) in the output |
| Long pause (sec) | 1.5 | Gap between segments that inserts a blank line (\n\n); gaps below short pause are joined with a space |
| Toggle mode | Off | When on: press once to start, press again to stop. When off: hold to record, release to transcribe |
| Use GPU | Off | When on, uses NVIDIA CUDA for transcription (requires compatible GPU); falls back to CPU silently if GPU init fails, so check %APPDATA%\wetflow\error.log |
| Output mode | Keyboard and clipboard | Controls how transcribed text is delivered. Keyboard only: SendInput only (fastest; some apps block it). Keyboard and clipboard: SendInput + writes to clipboard. Clipboard only: writes to clipboard only (use when SendInput doesn't work). |
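The Short pause and Long pause settings can be read as a rule for joining Whisper segments by the silence between them. The following is a minimal Python sketch of that rule, not the project's actual implementation. The function name, the `(start, end, text)` tuple layout, and the behavior exactly at the thresholds (treated as inclusive here) are assumptions.

```python
def join_segments(segments, short_pause=0.5, long_pause=1.5):
    """Join (start_sec, end_sec, text) segments, in time order, by pause length.

    Hypothetical helper: gap < short_pause -> space,
    short_pause <= gap < long_pause -> newline, gap >= long_pause -> blank line.
    """
    parts = []
    prev_end = None
    for start, end, text in segments:
        text = text.strip()
        if not text:
            continue  # ignore empty segments entirely
        if prev_end is not None:
            gap = start - prev_end
            if gap >= long_pause:
                parts.append("\n\n")
            elif gap >= short_pause:
                parts.append("\n")
            else:
                parts.append(" ")
        parts.append(text)
        prev_end = end
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        (0.0, 1.0, "Hello"),
        (1.2, 2.0, "world."),          # 0.2 s gap: joined with a space
        (2.8, 3.5, "Next line."),      # 0.8 s gap: single newline
        (5.5, 6.0, "New paragraph."),  # 2.0 s gap: blank line
    ]
    print(join_segments(demo))
```

Raising Short pause makes the output read more like continuous prose; lowering Long pause produces more paragraph breaks.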

Settings are saved to %APPDATA%\wetflow\settings.json.
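The Troubleshooting section notes that a corrupt or unreadable settings.json makes the app fall back to defaults and show a warning. A rough Python sketch of that load-with-fallback pattern is shown below. The JSON key names are hypothetical (the real file's schema is not documented here), `load_settings` is an illustrative name, and treating a missing file as a silent first run is an assumption.

```python
import json
from pathlib import Path

# Default values taken from the Settings table; the key names are hypothetical.
DEFAULTS = {
    "hotkey": "F12",
    "model": "base",
    "short_pause_sec": 0.5,
    "long_pause_sec": 1.5,
    "toggle_mode": False,
    "use_gpu": False,
    "output_mode": "Keyboard and clipboard",
}


def load_settings(path):
    """Return (settings, warning); warning is None when nothing went wrong."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            raise ValueError("settings root is not a JSON object")
    except FileNotFoundError:
        # Assumed first-run case: no file yet, so defaults without a warning.
        return dict(DEFAULTS), None
    except (OSError, ValueError) as exc:
        # Corrupt JSON (JSONDecodeError is a ValueError) or unreadable file.
        return dict(DEFAULTS), f"settings unreadable, using defaults: {exc}"
    # Keys missing from the file keep their default values.
    return {**DEFAULTS, **data}, None
```

Returning the warning alongside the settings lets the caller decide how to surface it, for example as a balloon tip plus a log entry.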

Troubleshooting

Errors are logged to %APPDATA%\wetflow\error.log with full stack traces.

| Symptom | Likely cause |
| --- | --- |
| No transcription, no error | Recording was too short (<200 ms). Speak for longer before releasing (push-to-talk) or pressing again (toggle mode) |
| Tray shows "Downloading…" for a long time | First-run model download; check your internet connection |
| Text injected in wrong case | Target app is intercepting modifier keys. Switch to a non-modifier hotkey |
| App won't start (second instance) | Already running; check the system tray |
| "WetFlow Warning" balloon on startup or after settings save | settings.json is corrupt or unreadable, so the app is using defaults. Check %APPDATA%\wetflow\error.log for details; delete settings.json to reset to defaults |
| GPU enabled but transcription is still slow | GPU init failed and fell back to CPU. Check %APPDATA%\wetflow\error.log for "GPU init failed"; verify NVIDIA drivers and CUDA are installed |
| "Audio saved to…" balloon appears | Transcription returned empty or threw an error. The WAV recording was moved to %APPDATA%\wetflow\failed-audio\ so you can retry manually. Check %APPDATA%\wetflow\error.log for details. |

Project structure

src/
  AppSettings.cs       — settings model, JSON load/save
  AudioRecorder.cs     — WASAPI capture, resample to 16 kHz mono WAV
  ClipboardMonitor.cs  — WM_CLIPBOARDUPDATE listener; detects when injected text is replaced
  KeyboardHook.cs      — global low-level keyboard hook (WH_KEYBOARD_LL)
  Program.cs           — entry point, single-instance mutex
  SettingsForm.cs      — hotkey capture + model selection dialog
  TextInjector.cs      — SendInput Unicode injection, OutputMode-controlled clipboard write
  Transcriber.cs       — Whisper.net local transcription, model auto-download
  TrayApp.cs           — orchestrator, tray icon, pipeline coordination

Releasing

To publish a new release, push a version tag:

git tag v1.0.1
git push origin v1.0.1

GitHub Actions will run tests, build a self-contained win-x64 ZIP, and create a GitHub release automatically.

License

MIT
