Add speech-runner: Swift inference for Whisper model #54

Open
carinapeng wants to merge 6 commits into apple:main from carinapeng:carina/speech-runner

@carinapeng

Contributor

Purpose

Adds a Swift CLI that loads a CoreAI Whisper export and transcribes audio.

Changes

  • swift/Sources/Tools/speech-runner/SpeechRunnerMain.swift: loads either export format, runs greedy decoding with a forced prefix and a KV cache, and decodes tokens to text via the bundled tokenizer
  • swift/Sources/Tools/speech-runner/WhisperMel.swift: computes the mel spectrogram in Swift
  • Package.swift: registers the speech-runner target
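
For reviewers unfamiliar with the decoding loop, here is a minimal sketch of greedy decoding with a forced prefix and a cache. It is not the code in SpeechRunnerMain.swift. `DecoderStep`, `greedyDecode`, `argmax`, and the toy model are hypothetical names made up for illustration, and the `[Int]` cache merely stands in for whatever key/value tensors the real decoder keeps between steps. The prefix tokens are assumed to be supplied by the caller.

```swift
// Hypothetical stand-in for one decoder step: consumes the newest token,
// updates the cache in place, and returns logits over the vocabulary.
typealias DecoderStep = (_ token: Int, _ cache: inout [Int]) -> [Float]

/// Index of the largest logit. Ties resolve to the lowest index.
func argmax(_ logits: [Float]) -> Int {
    var best = 0
    for i in logits.indices where logits[i] > logits[best] {
        best = i
    }
    return best
}

/// Greedy decoding: feed the forced prefix, then repeatedly pick the
/// highest-scoring token until end-of-text or the token budget is reached.
func greedyDecode(
    forcedPrefix: [Int],
    endOfText: Int,
    maxTokens: Int,
    step: DecoderStep
) -> [Int] {
    precondition(!forcedPrefix.isEmpty, "forced prefix must not be empty")
    var cache: [Int] = []      // assumption: placeholder for real KV tensors
    var logits: [Float] = []

    // Prefix tokens are forced, so only the logits after the last one matter.
    for token in forcedPrefix {
        logits = step(token, &cache)
    }

    var generated: [Int] = []
    while generated.count < maxTokens {
        let next = argmax(logits)
        if next == endOfText { break }
        generated.append(next)
        // Thanks to the cache, each step processes only the newest token.
        logits = step(next, &cache)
    }
    return generated
}

// Toy model over a 5-token vocabulary: always predicts (token + 1) % 5.
let toyStep: DecoderStep = { token, cache in
    cache.append(token)
    var logits = [Float](repeating: 0, count: 5)
    logits[(token + 1) % 5] = 1
    return logits
}

let tokens = greedyDecode(forcedPrefix: [0, 1], endOfText: 4, maxTokens: 8, step: toyStep)
print(tokens) // [2, 3]
```

The cache is what keeps each step cheap: without it, every iteration would have to re-run the decoder over the entire token sequence produced so far.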

Testing

Usage
swift run speech-runner <model-path> <audio.flac>

Tested by running both the converted model from ToT main and the reauthored model from branch carina/whisper. This PR has no impact on model performance or export quality.

@carinapeng carinapeng marked this pull request as draft June 17, 2026 21:14
@carinapeng carinapeng marked this pull request as ready for review June 17, 2026 21:14
1 participant