Faster qwen3 tts #9
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -528,6 +528,28 @@ | |
| nvidia-l4t-cuda-12: "nvidia-l4t-qwen-tts" | ||
| nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-qwen-tts" | ||
| icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png | ||
| - &faster-qwen3-tts | ||
| urls: | ||
| - https://github.com/andimarafioti/faster-qwen3-tts | ||
| - https://pypi.org/project/faster-qwen3-tts/ | ||
| description: | | ||
| Real-time Qwen3-TTS inference using CUDA graph capture. Voice clone only; requires NVIDIA GPU with CUDA. | ||
| tags: | ||
| - text-to-speech | ||
| - TTS | ||
| - voice-clone | ||
| license: apache-2.0 | ||
| name: "faster-qwen3-tts" | ||
| alias: "faster-qwen3-tts" | ||
| capabilities: | ||
| nvidia: "cuda12-faster-qwen3-tts" | ||
| default: "cuda12-faster-qwen3-tts" | ||
| nvidia-cuda-13: "cuda13-faster-qwen3-tts" | ||
| nvidia-cuda-12: "cuda12-faster-qwen3-tts" | ||
| nvidia-l4t: "nvidia-l4t-faster-qwen3-tts" | ||
| nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts" | ||
| nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts" | ||
| icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png | ||
| - &qwen-asr | ||
| urls: | ||
| - https://github.com/QwenLM/Qwen3-ASR | ||
|
|
@@ -2279,6 +2301,57 @@ | |
| uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-qwen-tts" | ||
| mirrors: | ||
| - localai/localai-backends:master-metal-darwin-arm64-qwen-tts | ||
| ## faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
|
Comment on lines 2301 to +2305
Suggested fix:
- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts"
  mirrors:
    - localai/localai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"
||
| name: "faster-qwen3-tts-development" | ||
| capabilities: | ||
| nvidia: "cuda12-faster-qwen3-tts-development" | ||
| default: "cuda12-faster-qwen3-tts-development" | ||
| nvidia-cuda-13: "cuda13-faster-qwen3-tts-development" | ||
| nvidia-cuda-12: "cuda12-faster-qwen3-tts-development" | ||
| nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development" | ||
| nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development" | ||
| nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development" | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "cuda12-faster-qwen3-tts" | ||
| uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:latest-gpu-nvidia-cuda-12-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "cuda12-faster-qwen3-tts-development" | ||
| uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "cuda13-faster-qwen3-tts" | ||
| uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-13-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:latest-gpu-nvidia-cuda-13-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "cuda13-faster-qwen3-tts-development" | ||
| uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-13-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:master-gpu-nvidia-cuda-13-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "nvidia-l4t-faster-qwen3-tts" | ||
| uri: "quay.io/go-skynet/local-ai-backends:latest-nvidia-l4t-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:latest-nvidia-l4t-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "nvidia-l4t-faster-qwen3-tts-development" | ||
| uri: "quay.io/go-skynet/local-ai-backends:master-nvidia-l4t-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:master-nvidia-l4t-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts" | ||
| uri: "quay.io/go-skynet/local-ai-backends:latest-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:latest-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts | ||
| - !!merge <<: *faster-qwen3-tts | ||
| name: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development" | ||
| uri: "quay.io/go-skynet/local-ai-backends:master-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts" | ||
| mirrors: | ||
| - localai/localai-backends:master-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts | ||
| ## qwen-asr | ||
| - !!merge <<: *qwen-asr | ||
| name: "qwen-asr-development" | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| .PHONY: faster-qwen3-tts | ||
| faster-qwen3-tts: | ||
| bash install.sh | ||
|
|
||
| .PHONY: run | ||
| run: faster-qwen3-tts | ||
| @echo "Running faster-qwen3-tts..." | ||
| bash run.sh | ||
| @echo "faster-qwen3-tts run." | ||
|
|
||
| .PHONY: test | ||
| test: faster-qwen3-tts | ||
| @echo "Testing faster-qwen3-tts..." | ||
| bash test.sh | ||
| @echo "faster-qwen3-tts tested." | ||
|
|
||
| .PHONY: protogen-clean | ||
| protogen-clean: | ||
| $(RM) backend_pb2_grpc.py backend_pb2.py | ||
|
|
||
| .PHONY: clean | ||
| clean: protogen-clean | ||
| rm -rf venv __pycache__ |
Selecting the new backend with capability `default` on a CPU-only host now resolves to `cuda12-faster-qwen3-tts`, so a request like `backend: faster-qwen3-tts` without NVIDIA hardware will always pull a CUDA image and fail at runtime because `backend.py` rejects `torch.cuda.is_available() == False`.

Remove the `default` capability for this backend or point it to no image/CPU-safe fallback so generic backend selection does not choose a CUDA-only container on non-NVIDIA machines.
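The concern in the review comment above can be illustrated with a minimal sketch. It assumes a simplified resolver that tries an exact capability match and then falls back to the `default` key; `resolve_backend` and the `"cpu"` capability label are hypothetical names used only for illustration, and the project's real selection logic may differ. The capability map itself is copied from the `faster-qwen3-tts` entry in this diff.

```python
from typing import Optional

# Capability map copied from the faster-qwen3-tts gallery entry in this diff.
CAPABILITIES = {
    "nvidia": "cuda12-faster-qwen3-tts",
    "default": "cuda12-faster-qwen3-tts",
    "nvidia-cuda-13": "cuda13-faster-qwen3-tts",
    "nvidia-cuda-12": "cuda12-faster-qwen3-tts",
    "nvidia-l4t": "nvidia-l4t-faster-qwen3-tts",
    "nvidia-l4t-cuda-12": "nvidia-l4t-faster-qwen3-tts",
    "nvidia-l4t-cuda-13": "cuda13-nvidia-l4t-arm64-faster-qwen3-tts",
}


def resolve_backend(capabilities: dict, detected: str) -> Optional[str]:
    """Hypothetical resolver: exact capability match, else the "default" key.

    This is an assumed model of the selection behaviour described in the
    review comment, not the project's actual implementation.
    """
    if detected in capabilities:
        return capabilities[detected]
    return capabilities.get("default")


# "cpu" is an assumed label for a host with no NVIDIA hardware; it has no
# entry in the map, so resolution falls through to "default".
with_default = resolve_backend(CAPABILITIES, "cpu")

# Same map with the "default" key removed, as the comment suggests.
without_default = {k: v for k, v in CAPABILITIES.items() if k != "default"}
no_fallback = resolve_backend(without_default, "cpu")

print(with_default)
print(no_fallback)
```

Under these assumptions, the first lookup returns the CUDA 12 image name even though the host has no GPU, while the second returns `None`, which lets the caller report that no compatible backend exists instead of pulling a container that fails at startup.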