Faster qwen3 tts#9
Conversation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@codity review
PR Summary
What Changed
Key Changes by Area
Backend Implementation: New gRPC server at
Build System: Added to
CI/CD: New CUDA 12/13 builds for
Files Changed
Review Focus Areas
Architecture
Design Decisions: Voice-clone only (no base voice synthesis) keeps scope narrow. CUDA graph capture trades flexibility for latency. Separate requirements files per CUDA version and platform (L4T vs desktop) manage dependency complexity explicitly.
Scalability & Extensibility: Backend follows existing gRPC pattern. 8 variants in index.yaml allow version pinning but increase maintenance surface. Out of scope: CPU fallback or quantization.
Risks: Intentional: NVIDIA-only, no CPU support. Unintentional: L4T wheel URLs may rot; test coverage gaps for actual hardware (tests gate on CUDA availability).
Merge Status
NOT MERGEABLE: PR Score 13/100, below threshold (50)
Workflow Diagrams
Automatically generated sequence diagrams showing the workflows in this PR
1. Analysis
This PR introduces a new TTS (Text-to-Speech...
Complexity: complex • Components: faster-qwen3-tts gRPC backend service, GitHub Actions CI/CD workflow, Docker build matrix
sequenceDiagram
title: faster-qwen3-tts Backend Integration Workflow
participant Dev as Developer
participant GH as GitHub Actions CI
participant MK as Makefile
participant DK as Docker Build
participant BI as Backend Index
participant LA as LocalAI Runtime
Note over Dev,LA: PR introduces faster-qwen3-tts TTS backend with multi-platform CUDA support
Dev->>MK: Add BACKEND_FASTER_QWEN3_TTS metadata<br/>Makefile:2/3 chunk
MK-->>Dev: Register python backend with root context
Dev->>GH: Push changes triggering backend.yml workflow
par Parallel Build Matrix
GH->>DK: Build CUDA 12 amd64<br/>backend.yml:210+
Note right of DK: cublas, cuda 12.8, ubuntu 24.04
DK-->>GH: Image: local-ai-backends:latest-gpu-nvidia-cuda-12-faster-qwen3-tts
GH->>DK: Build CUDA 13 amd64<br/>backend.yml:588+
Note right of DK: cublas, cuda 13.0, ubuntu 24.04
DK-->>GH: Image: local-ai-backends:latest-gpu-nvidia-cuda-13-faster-qwen3-tts
GH->>DK: Build L4T CUDA 12 arm64<br/>backend.yml:1345+
Note right of DK: NVIDIA Jetson, arm64, ubuntu 22.04
DK-->>GH: Image: local-ai-backends:latest-nvidia-l4t-faster-qwen3-tts
GH->>DK: Build L4T CUDA 13 arm64<br/>backend.yml:731+
Note right of DK: NVIDIA Jetson, arm64, ubuntu 24.04
DK-->>GH: Image: local-ai-backends:latest-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts
end
GH->>BI: Register backend variants<br/>backend/index.yaml
Note right of BI: YAML anchors define image URIs<br/>for runtime discovery
LA->>BI: Query available backends at startup
BI-->>LA: Return faster-qwen3-tts image URIs<br/>for detected platform
alt CUDA GPU available
LA->>DK: Pull and run faster-qwen3-tts container
DK-->>LA: gRPC service ready on configured port
LA->>DK: TTS request via gRPC
DK-->>LA: Audio output stream
else CPU only or CUDA unavailable
LA->>LA: Skip faster-qwen3-tts registration
Note right of LA: Backend requires NVIDIA GPU<br/>Graceful degradation to CPU TTS
end
Note: Diagrams show detected patterns only. Complex workflows may require manual review.
capabilities:
  nvidia: "cuda12-faster-qwen3-tts"
  default: "cuda12-faster-qwen3-tts"
  nvidia-cuda-13: "cuda13-faster-qwen3-tts"
Selecting the new backend with capability default on a CPU-only host now resolves to cuda12-faster-qwen3-tts, so a request like backend: faster-qwen3-tts without NVIDIA hardware will always pull a CUDA image and fail at runtime because backend.py rejects torch.cuda.is_available()==False.
Remove the default capability for this backend or point it to no image/CPU-safe fallback so generic backend selection does not choose a CUDA-only container on non-NVIDIA machines.
File: backend/index.yaml
Lines: 544-547
Issue Type: functional-high
Severity: high
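The failure mode in this finding can be illustrated with a small sketch. It assumes, hypothetically, that variant selection is a dictionary lookup keyed by the detected capability with a fallback to `default`. The function `resolve_variant` and the capability string `"cpu"` are illustrative names only, not LocalAI's actual resolver.

```python
from typing import Dict, Optional


def resolve_variant(capabilities: Dict[str, str], detected: str) -> Optional[str]:
    """Hypothetical lookup: exact capability match first, then `default`."""
    return capabilities.get(detected, capabilities.get("default"))


with_default = {
    "nvidia": "cuda12-faster-qwen3-tts",
    "default": "cuda12-faster-qwen3-tts",
    "nvidia-cuda-13": "cuda13-faster-qwen3-tts",
}
without_default = {k: v for k, v in with_default.items() if k != "default"}

# On a host with no NVIDIA capability, the default entry selects a CUDA image.
cpu_choice = resolve_variant(with_default, "cpu")

# Without a default entry nothing is selected, so the caller can report that
# the backend is unavailable instead of failing later at runtime.
cpu_choice_no_default = resolve_variant(without_default, "cpu")
```

Under this assumption, dropping `default` turns a late runtime failure into an early "no matching variant" result that the caller can handle.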
if not request.dst:
    return backend_pb2.Result(
        success=False,
        message="dst (output path) is required"
    )
text = request.text.strip()
if not text:
    return backend_pb2.Result(
        success=False,
        message="Text is empty"
    )

language = request.language if hasattr(request, 'language') and request.language else None
if not language or language == "":
    language = "English"

ref_audio = self._get_ref_audio_path(request)
if not ref_audio:
    return backend_pb2.Result(
        success=False,
        message="AudioPath is required for voice clone (set in LoadModel)"
    )
ref_text = self.options.get("ref_text")
if not ref_text and hasattr(request, 'ref_text') and request.ref_text:
    ref_text = request.ref_text
if not ref_text:
    return backend_pb2.Result(
        success=False,
        message="ref_text is required for voice clone (set via LoadModel Options, e.g. ref_text:Your reference transcript)"
    )

chunk_size = self.options.get("chunk_size")
generation_kwargs = {}
if chunk_size is not None:
    generation_kwargs["chunk_size"] = int(chunk_size)

audio_list, sr = self.model.generate_voice_clone(
    text=text,
    language=language,
    ref_audio=ref_audio,
    ref_text=ref_text,
    **generation_kwargs
)

if audio_list is None or (isinstance(audio_list, list) and len(audio_list) == 0):
    return backend_pb2.Result(
        success=False,
        message="No audio output generated"
    )
audio_data = audio_list[0] if isinstance(audio_list, list) else audio_list
sf.write(request.dst, audio_data, sr)
The server writes to a client-controlled filesystem path without validation, so restrict dst to an allowed output directory and reject absolute or traversing paths.
Suggested fix
if not request.dst:
    return backend_pb2.Result(
        success=False,
        message="dst (output path) is required"
    )
# Resolve the base once so both sides of the prefix check are absolute.
output_base = os.path.abspath(os.environ.get("LOCALAI_OUTPUT_DIR", "/tmp/localai"))
requested_path = os.path.normpath(request.dst)
if os.path.isabs(requested_path):
    return backend_pb2.Result(success=False, message="absolute dst paths are not allowed")
final_path = os.path.normpath(os.path.join(output_base, requested_path))
if not final_path.startswith(output_base + os.sep):
    return backend_pb2.Result(success=False, message="invalid dst path")
...
sf.write(final_path, audio_data, sr)
File: backend/python/faster-qwen3-tts/backend.py
Lines: 99-149
Issue Type: security-high
Severity: high
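The containment check recommended in this finding can be written as a standalone helper. This is a minimal sketch, not the backend's code: the name `resolve_under` is hypothetical, and it uses `os.path.commonpath` rather than a string-prefix test so that sibling directories with a shared prefix are not accepted.

```python
import os


def resolve_under(base_dir: str, user_path: str) -> str:
    """Return an absolute path strictly inside base_dir, or raise ValueError."""
    if os.path.isabs(user_path):
        raise ValueError("absolute paths are not allowed")
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so "/out-evil" is not
    # treated as a child of "/out" the way a plain startswith check could.
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the output directory")
    return candidate
```

A caller would pass the server-configured output directory as `base_dir` and the client-supplied `dst` as `user_path`, then write only to the returned path.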
if not self.audio_path:
    return None
if os.path.isabs(self.audio_path):
    return self.audio_path
if self.model_file:
    model_file_base = os.path.dirname(self.model_file)
    ref_path = os.path.join(model_file_base, self.audio_path)
    if os.path.exists(ref_path):
        return ref_path
if self.model_path:
    ref_path = os.path.join(self.model_path, self.audio_path)
    if os.path.exists(ref_path):
        return ref_path
return self.audio_path
AudioPath is accepted as an absolute path and otherwise returned unchecked, so validate it against an allowlisted base directory before opening local files.
Suggested fix
def _get_ref_audio_path(self, request):
    if not self.audio_path:
        return None
    allowed_bases = []
    if self.model_file:
        allowed_bases.append(os.path.abspath(os.path.dirname(self.model_file)))
    if self.model_path:
        allowed_bases.append(os.path.abspath(self.model_path))
    for base in allowed_bases:
        candidate = os.path.abspath(os.path.join(base, self.audio_path))
        if candidate.startswith(base + os.sep) and os.path.exists(candidate):
            return candidate
    return None
File: backend/python/faster-qwen3-tts/backend.py
Lines: 82-95
Issue Type: security-high
Severity: high
model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
self.audio_path = request.AudioPath if hasattr(request, 'AudioPath') and request.AudioPath else None
self.model_file = request.ModelFile if hasattr(request, 'ModelFile') and request.ModelFile else None
self.model_path = request.ModelPath if hasattr(request, 'ModelPath') and request.ModelPath else None

from faster_qwen3_tts import FasterQwen3TTS
print(f"Loading model from: {model_path}", file=sys.stderr)
try:
    self.model = FasterQwen3TTS.from_pretrained(model_path)
Loading an arbitrary model identifier from the request lets remote callers trigger untrusted model downloads or local path access, so restrict model_path to an allowlist or server-side configuration.
Suggested fix
allowed_models = {
    "Qwen/Qwen3-TTS-12Hz-0.6B-Base",
}
model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
if model_path not in allowed_models:
    return backend_pb2.Result(success=False, message="unsupported model")
...
self.model = FasterQwen3TTS.from_pretrained(model_path)
File: backend/python/faster-qwen3-tts/backend.py
Lines: 64-72
Issue Type: security-high
Severity: high
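The finding mentions server-side configuration as an alternative to a hard-coded allowlist. One way that could look is sketched below. The environment variable `ALLOWED_TTS_MODELS` and the helper names are assumptions made for illustration; nothing in the PR defines them.

```python
import os
from typing import Mapping, Set

DEFAULT_MODEL = "Qwen/Qwen3-TTS-12Hz-0.6B-Base"


def allowed_models(env: Mapping[str, str] = os.environ) -> Set[str]:
    """Default model plus any comma-separated names from a hypothetical env var."""
    raw = env.get("ALLOWED_TTS_MODELS", "")
    extra = {name.strip() for name in raw.split(",") if name.strip()}
    return {DEFAULT_MODEL} | extra


def select_model(requested: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the model to load, or raise ValueError if it is not allowlisted."""
    model = requested or DEFAULT_MODEL
    if model not in allowed_models(env):
        raise ValueError(f"unsupported model: {model}")
    return model
```

The operator, not the remote caller, then decides which identifiers may reach `from_pretrained`.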
--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts
Using an unpinned package name for faster-qwen3-tts allows a supply chain swap to a newer or compromised release, so pin it to an exact version.
- faster-qwen3-tts
+ faster-qwen3-tts==<exact-version>
File: backend/python/faster-qwen3-tts/requirements-cublas13.txt
Lines: 4-4
Issue Type: security-high
Severity: high
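A pinning rule like this can be enforced mechanically. The sketch below is a deliberately simple check, not a full requirements parser: it treats any non-option, non-comment line without `==` as unpinned. The function name `unpinned_requirements` is illustrative.

```python
from typing import List


def unpinned_requirements(text: str) -> List[str]:
    """List requirement lines that carry no exact `==` version pin."""
    unpinned = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop trailing comments
        if not entry or entry.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in entry:
            unpinned.append(entry)
    return unpinned


sample = """\
--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts
grpcio==1.71.0
"""

flagged = unpinned_requirements(sample)
```

Run in CI against each requirements file, a non-empty result could fail the build before an unpinned dependency reaches an image.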
--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts
Using an untrusted extra package index allows dependency confusion; pin the package to the official source or use --index-url if all packages must come from the PyTorch index.
Also reported at: backend/python/faster-qwen3-tts/requirements-l4t12.txt L1–L4
Suggested fix
# pip applies index options to the whole requirements file, not only to the lines that follow them
--index-url https://download.pytorch.org/whl/cu130
--extra-index-url https://pypi.org/simple
torch==<pinned-version>
torchaudio==<pinned-version>
faster-qwen3-tts==<pinned-version>
File: backend/python/faster-qwen3-tts/requirements-l4t13.txt
Lines: 1-4
Issue Type: security-high
Severity: high
    source $backend_dir/../common/libbackend.sh
fi

startBackend $@
Using $@ unquoted changes argument boundaries, so call startBackend with "$@".
- startBackend $@
+ startBackend "$@"
File: backend/python/faster-qwen3-tts/run.sh
Lines: 9-9
Issue Type: functional-high
Severity: high
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
Unquoted path expansions allow word splitting and option interpretation, so quote $0 and all $backend_dir uses.
Also reported at: backend/python/faster-qwen3-tts/install.sh L6–L10
Suggested fix
backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi
File: backend/python/faster-qwen3-tts/run.sh
Lines: 2-6
Issue Type: security-medium
Severity: medium
@unittest.skipUnless(
    __import__("torch").cuda.is_available(),
    "faster-qwen3-tts TTS requires CUDA",
The skip condition imports torch at module load time and will crash the test file on environments without torch installed, so guard the import and default to skipping.
Suggested fix
try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except Exception:
    _HAS_CUDA = False

@unittest.skipUnless(
    _HAS_CUDA,
    "faster-qwen3-tts TTS requires CUDA",
)
File: backend/python/faster-qwen3-tts/test.py
Lines: 52-54
Issue Type: functional-medium
Severity: medium
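A variant of the same guard, sketched here as a self-contained module, checks whether `torch` is installed before importing it. The helper `cuda_available` and the test class name are illustrative. The broad `except` is intentional, on the assumption that a broken torch install should skip the test rather than crash collection.

```python
import importlib.util
import unittest


def cuda_available() -> bool:
    """True only if torch is installed, imports cleanly, and reports CUDA."""
    if importlib.util.find_spec("torch") is None:
        return False
    try:
        import torch
        return bool(torch.cuda.is_available())
    except Exception:
        return False


@unittest.skipUnless(cuda_available(), "faster-qwen3-tts TTS requires CUDA")
class TestRequiresCuda(unittest.TestCase):
    def test_cuda_reported(self):
        # Runs only where the decorator found a usable CUDA device.
        self.assertTrue(cuda_available())
```

On a machine without torch the module still imports and the class is reported as skipped.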
Security Scan Summary
No critical security issues detected. Scan completed in 9.5s.
Dependency vulnerability scanning
Vulnerability details (4 items):
1. pip 24.0 CVE: GHSA-4xh5-x5gv-qwph
2. pip 24.0 CVE: GHSA-6vgw-5pg2-w6jp
3. pip 24.0 CVE: GHSA-58qw-9mgm-455v
4. pip 24.0 CVE: GHSA-jp4c-xjxw-mgf9
License Compliance Scan
Weak copyleft licenses found - verify compatibility
Some packages have unknown licenses - manual review required
Medium Risk Licenses - 4 packages
MPL-2.0 (4 packages):
Unknown Licenses - 73 packages
...and 53 more
Code Quality Report: test-org-codity/LocalAI · PR #9
Scanned: 2026-05-21 19:50 UTC | Score: 22/100 | Provider: github
Executive Summary
Top Findings
[CQ-LLM-001]
| File | Critical | High | Medium | Low | Total |
|---|---|---|---|---|---|
| .github/workflows/backend.yml | 0 | 1 | 0 | 16 | 17 |
| Makefile | 0 | 0 | 0 | 3 | 3 |
| backend/index.yaml | 0 | 0 | 2 | 20 | 22 |
| backend/python/faster-qwen3-tts/backend.py | 0 | 0 | 1 | 24 | 25 |
| backend/python/faster-qwen3-tts/requirements.txt | 0 | 0 | 0 | 1 | 1 |
| backend/python/faster-qwen3-tts/test.py | 0 | 0 | 1 | 49 | 50 |
Recommendations
- Resolve High severity issues, especially error handling gaps and performance bottlenecks.
- Run automated tests after applying fixes to verify no regressions.
@codity review
PR Summary
What Changed
Key Changes by Area
Backend Implementation: New gRPC server with
CI/CD: Added CUDA 12/13 builds for
Build System: Added
Files Changed
Review Focus Areas
Architecture
Design Decisions: Voice cloning only (no base voice generation). CUDA enforcement at runtime rather than build time allows single image with runtime detection. Platform-specific requirements files isolate Jetson wheel dependencies.
Risks: Shell scripts have unquoted path variables (intentional? should be fixed).
Merge Status
NOT MERGEABLE: PR Score 33/100, below threshold (50)
Workflow Diagrams
Automatically generated sequence diagrams showing the workflows in this PR
1. Workflow Analysis
Complexity: complex • Components: faster-qwen3-tts gRPC backend, GitHub Actions CI/CD workflow, Docker build system
sequenceDiagram
title: Faster Qwen3 TTS Backend Integration Workflow
participant Dev as Developer
participant GH as GitHub Actions
participant Docker as Docker Build
participant Make as Makefile
participant Backend as faster_qwen3_tts
participant LocalAI as LocalAI Core
participant HF as HuggingFace
Dev->>Make: Add faster-qwen3-tts to NOTPARALLEL list
Make->>Make: Register backend metadata<br/>BACKEND_FASTER_QWEN3_TTS
Dev->>GH: Push to faster-qwen3-tts branch
GH->>GH: Trigger backend.yml workflow
par CUDA 12 Build
GH->>Docker: Build cublas CUDA 12<br/>linux/amd64
Docker->>Backend: Install requirements-cublas12.txt
and CUDA 13 Build
GH->>Docker: Build cublas CUDA 13<br/>linux/amd64
Docker->>Backend: Install requirements-cublas13.txt
and L4T CUDA 12 Build
GH->>Docker: Build l4t CUDA 12<br/>linux/arm64
Docker->>Backend: Install requirements-l4t12.txt
and L4T CUDA 13 Build
GH->>Docker: Build l4t CUDA 13<br/>linux/arm64
Docker->>Backend: Install requirements-l4t13.txt
end
Docker->>Backend: Run install.sh
Backend->>HF: Download Qwen/Qwen3-TTS-12Hz-0.6B-Base
Note over Backend: CUDA graph capture enabled<br/>Voice clone only mode
GH->>Docker: Tag and push images
Docker->>GH: Publish to quay.io/localai-backends
Dev->>Make: make docker-build-faster-qwen3-tts
Make->>Docker: Build local-ai-backend:faster-qwen3-tts
Docker->>Make: Save as backend-images/faster-qwen3-tts.tar
Make->>LocalAI: Install via ocifile protocol
LocalAI->>Backend: Initialize gRPC service
Backend->>Backend: Check torch.cuda.is_available
alt CUDA available
Backend->>Backend: Load model with CUDA graph
else CUDA not available
Backend->>Backend: Exit with error
end
LocalAI->>Backend: TTS request via backend_pb2
Backend->>Backend: Generate audio with soundfile
Backend->>LocalAI: Return audio data
Note over Backend,LocalAI: Backend registered in<br/>backend/index.yaml for<br/>multiple hardware variants
Note: Diagrams show detected patterns only. Complex workflows may require manual review.
key, value = opt.split(":", 1)
if is_float(value):
    value = float(value)
elif is_int(value):
    value = int(value)
elif value.lower() in ["true", "false"]:
    value = value.lower() == "true"
self.options[key] = value

model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
self.audio_path = request.AudioPath if hasattr(request, 'AudioPath') and request.AudioPath else None
The option-parsing loop applies is_float before is_int. Since is_float succeeds for any valid integer string (e.g. "1" parses to 1.0), integer-valued options such as chunk_size will be stored as float instead of int. Later, int(chunk_size) is called defensively, but ref_text or other string options that happen to be purely numeric will also be silently mistyped. More critically, this means an integer option like chunk_size:512 becomes 512.0 (float), and if the upstream generate_voice_clone is strict about type, it may raise a TypeError at runtime.
Suggested fix
if is_int(value):
    value = int(value)
elif is_float(value):
    value = float(value)
elif value.lower() in ["true", "false"]:
    value = value.lower() == "true"
File: backend/python/faster-qwen3-tts/backend.py
Lines: 55-65
Issue Type: functional-high
Severity: high
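The effect of the ordering can be checked with a self-contained version of the loop. The bodies of `is_int` and `is_float` are not shown in the PR excerpt, so the implementations below are assumptions, as is the wrapper `parse_options`.

```python
from typing import Dict, List, Union

OptionValue = Union[int, float, bool, str]


def is_int(s: str) -> bool:
    try:
        int(s)
        return True
    except ValueError:
        return False


def is_float(s: str) -> bool:
    try:
        float(s)
        return True
    except ValueError:
        return False


def parse_options(raw: List[str]) -> Dict[str, OptionValue]:
    """Parse "key:value" strings, trying int before float so "512" stays an int."""
    options: Dict[str, OptionValue] = {}
    for opt in raw:
        if ":" not in opt:
            continue
        key, value = opt.split(":", 1)
        if is_int(value):
            options[key] = int(value)
        elif is_float(value):
            options[key] = float(value)
        elif value.lower() in ("true", "false"):
            options[key] = value.lower() == "true"
        else:
            options[key] = value
    return options


parsed = parse_options(["chunk_size:512", "temperature:0.7", "stream:true", "ref_text:Hello: world"])
```

Splitting on the first colon only keeps values such as a reference transcript intact even when they contain colons themselves.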
    return backend_pb2.Result(
        success=False,
        message="dst (output path) is required"
    )
text = request.text.strip()
if not text:
If TTS is called before LoadModel (or if LoadModel failed after the CUDA check but before setting self.audio_path, self.model_file, self.model_path, or self.options), accessing self.audio_path / self.options etc. in _get_ref_audio_path and the rest of TTS will raise an AttributeError, crashing the gRPC handler and returning no structured error to the caller. The server continues running but that request is silently aborted. A guard check for model readiness is needed.
Suggested fix
def TTS(self, request, context):
    try:
        if not hasattr(self, 'model') or self.model is None:
            return backend_pb2.Result(
                success=False,
                message="Model not loaded. Call LoadModel first."
            )
        if not request.dst:
            return backend_pb2.Result(
                success=False,
                message="dst (output path) is required"
            )
File: backend/python/faster-qwen3-tts/backend.py
Lines: 100-105
Issue Type: robustness-high
Severity: high
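A complementary approach is to define the servicer's state in `__init__`, so that no attribute can be missing when `TTS` runs. The sketch below is illustrative only: `Result` stands in for `backend_pb2.Result`, and the `Servicer` class and its simplified `TTS` signature are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Result:
    """Stand-in for backend_pb2.Result (assumption for this sketch)."""
    success: bool
    message: str = ""


class Servicer:
    def __init__(self) -> None:
        # Defining state up front means TTS never raises AttributeError,
        # even if LoadModel was never called or failed part-way through.
        self.model: Optional[Any] = None
        self.options: Dict[str, Any] = {}
        self.audio_path: Optional[str] = None

    def TTS(self, dst: str, text: str) -> Result:
        if self.model is None:
            return Result(False, "Model not loaded. Call LoadModel first.")
        if not dst:
            return Result(False, "dst (output path) is required")
        return Result(True, "ok")
```

With the attributes initialized, the readiness guard reduces to a plain `is None` check and the caller always receives a structured error.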
audio_list, sr = self.model.generate_voice_clone(
    text=text,
    language=language,
    ref_audio=ref_audio,
The output file path request.dst is passed directly to sf.write() without validation. A caller can write audio output to any filesystem location the process has access to (e.g. /etc/cron.d/backdoor.wav, overwriting existing files, or writing outside the intended temp directory). The path should be validated to fall within the expected output directory before use.
Prompt for AI assistance
Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:
You are an expert bash developer with deep knowledge of security, performance, and best practices.
### Context
File: backend/python/faster-qwen3-tts/backend.py
Lines: 138-138
Issue Type: security-medium
Severity: medium
Issue Description:
The output file path `request.dst` is passed directly to `sf.write()` without validation. A caller can write audio output to any filesystem location the process has access to (e.g. `/etc/cron.d/backdoor.wav`, overwriting existing files, or writing outside the intended temp directory). The path should be validated to fall within the expected output directory before use.
Current Code:
sf.write(request.dst, audio_data, sr)
---
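One way to constrain `request.dst` is to resolve it against an allowed directory and reject anything that lands outside it. The sketch below assumes the backend knows its intended output directory; `resolve_output_path` and `allowed_dir` are hypothetical names, and where that directory comes from in LocalAI is not established by this PR.

```python
import os


def resolve_output_path(dst, allowed_dir):
    """Return dst as an absolute path, or raise ValueError if it escapes allowed_dir."""
    base = os.path.realpath(allowed_dir)
    # Joining with an absolute dst yields dst itself, so both relative and
    # absolute inputs are resolved (symlinks included) before the check.
    target = os.path.realpath(os.path.join(base, dst))
    # commonpath compares whole path components, so /tmp/outfoo does not
    # pass as being inside /tmp/out.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"dst escapes the output directory: {dst!r}")
    return target
```

In `TTS`, the handler would call this before `sf.write` and turn the `ValueError` into a `Result(success=False, ...)` response.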
grpcio==1.71.0
protobuf
certifi
packaging==24.1
soundfile
setuptools
six
sox
All dependency versions in requirements.txt are unpinned except grpcio==1.71.0 and packaging==24.1. This means protobuf, certifi, soundfile, setuptools, six, and sox will resolve to whatever is latest at install time, potentially breaking the backend silently on a future pip install if a breaking release is published. The CUDA-specific requirements files also leave torch, torchaudio, and faster-qwen3-tts fully unpinned. For a production backend image, pinning is strongly recommended.
Prompt for AI assistance
Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:
You are an expert Python developer with deep knowledge of security, performance, and best practices.
### Context
File: backend/python/faster-qwen3-tts/requirements.txt
Lines: 1-8
Issue Type: maintainability-medium
Severity: medium
Issue Description:
All dependency versions in `requirements.txt` are unpinned except `grpcio==1.71.0` and `packaging==24.1`. This means `protobuf`, `certifi`, `soundfile`, `setuptools`, `six`, and `sox` will resolve to whatever is latest at install time, potentially breaking the backend silently on a future pip install if a breaking release is published. The CUDA-specific requirements files also leave `torch`, `torchaudio`, and `faster-qwen3-tts` fully unpinned. For a production backend image, pinning is strongly recommended.
Current Code:
grpcio==1.71.0
protobuf
certifi
packaging==24.1
soundfile
setuptools
six
sox
---
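Choosing the pinned versions is up to the maintainers, but a small check can at least flag unpinned lines in CI. The following is a rough sketch; `unpinned_requirements` is a hypothetical helper, and it treats only exact `==` pins as pinned, so URL or range requirements are reported as unpinned too.

```python
import re

# Matches a requirement pinned to an exact version, e.g. "grpcio==1.71.0".
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines in text that lack an exact == pin."""
    missing = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            missing.append(line)
    return missing
```

Run against the `requirements.txt` quoted above, it reports the six packages named in the issue description.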
    cwd=os.path.dirname(os.path.abspath(__file__)),
)
The test setUp unconditionally sleeps 15 seconds waiting for the backend process to start. If the process fails immediately (e.g. import error, port conflict), the test suite still waits the full 15 seconds and then gets cryptic gRPC connection-refused errors rather than a clear failure. A readiness poll with a short sleep-and-retry loop would both speed up the common case and give a clearer failure message.
Prompt for AI assistance
Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:
You are an expert Python developer with deep knowledge of security, performance, and best practices.
### Context
File: backend/python/faster-qwen3-tts/test.py
Lines: 22-23
Issue Type: robustness-low
Severity: low
Issue Description:
The test `setUp` unconditionally sleeps 15 seconds waiting for the backend process to start. If the process fails immediately (e.g. import error, port conflict), the test suite still waits the full 15 seconds and then gets cryptic gRPC connection-refused errors rather than a clear failure. A readiness poll with a short sleep-and-retry loop would both speed up the common case and give a clearer failure message.
Current Code:
self.service = subprocess.Popen(
    ["python3", "backend.py", "--addr", "localhost:50052"],
---
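The readiness poll suggested above might be sketched as follows. `wait_for_port` is a hypothetical helper, not part of the PR; it uses only the standard library and treats "the TCP port accepts connections" as a proxy for "the backend is up", which is an assumption.

```python
import socket
import time


def wait_for_port(host, port, timeout=15.0, interval=0.2, proc=None):
    """Block until host:port accepts TCP connections.

    Raises RuntimeError if proc (a subprocess.Popen) exits first, and
    TimeoutError if the port never opens within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Fail fast with a clear message if the backend process has died.
        if proc is not None and proc.poll() is not None:
            raise RuntimeError(f"backend exited early with code {proc.returncode}")
        try:
            with socket.create_connection((host, port), timeout=interval):
                return
        except OSError:
            time.sleep(interval)
    raise TimeoutError(f"{host}:{port} not reachable after {timeout}s")
```

In `setUp`, a call such as `wait_for_port("localhost", 50052, proc=self.service)` would replace the fixed sleep. A stricter variant could wait on `grpc.channel_ready_future(channel).result(timeout=...)` instead of a raw socket.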
                CUDA=True,
            ),
            timeout=10.0,
        )
        self.assertFalse(response.success)

@unittest.skipUnless(
    __import__("torch").cuda.is_available(),
    "faster-qwen3-tts TTS requires CUDA",
)
def test_tts(self):
    import soundfile as sf
test_load_model_requires_cuda asserts self.assertFalse(response.success) unconditionally. On a CI runner that does have a GPU, LoadModel will succeed and the assertion will fail, breaking the test suite on GPU-enabled hosts. The test intent is to confirm behaviour when CUDA is absent; it should be skipped when CUDA is available.
Suggested fix
@unittest.skipIf(
    __import__("torch").cuda.is_available(),
    "Skipping CUDA-required test on GPU-enabled host",
)
def test_load_model_requires_cuda(self):
    with grpc.insecure_channel("localhost:50052") as channel:
        stub = backend_pb2_grpc.BackendStub(channel)
        response = stub.LoadModel(
            backend_pb2.ModelOptions(
                Model="Qwen/Qwen3-TTS-12Hz-0.6B-Base",
                CUDA=True,
            ),
            timeout=10.0,
        )
        self.assertFalse(response.success)
Prompt for AI assistance
Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:
You are an expert Python developer with deep knowledge of security, performance, and best practices.
### Context
File: backend/python/faster-qwen3-tts/test.py
Lines: 46-57
Issue Type: functional-medium
Severity: medium
Issue Description:
`test_load_model_requires_cuda` asserts `self.assertFalse(response.success)` unconditionally. On a CI runner that *does* have a GPU, `LoadModel` will succeed and the assertion will fail, breaking the test suite on GPU-enabled hosts. The test intent is to confirm behaviour when CUDA is *absent*; it should be skipped when CUDA is available.
Current Code:
def test_load_model_requires_cuda(self):
    with grpc.insecure_channel("localhost:50052") as channel:
        stub = backend_pb2_grpc.BackendStub(channel)
        response = stub.LoadModel(
            backend_pb2.ModelOptions(
                Model="Qwen/Qwen3-TTS-12Hz-0.6B-Base",
                CUDA=True,
            ),
            timeout=10.0,
        )
        self.assertFalse(response.success)
---
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-qwen-tts"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-qwen-tts
## faster-qwen3-tts
- !!merge <<: *faster-qwen3-tts
The faster-qwen3-tts-development entry (the first merged block after the anchor) overrides capabilities but does NOT set a uri or mirrors. This means the development variant has no image URI; when LocalAI tries to resolve and pull this backend image for a development/master install, it will find no image reference and fail silently or with a confusing error. The qwen-tts-development entry and others in the file all carry explicit uri + mirrors pointing to master-* tagged images.
Suggested fix
- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts"
  mirrors:
    - localai/localai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"
Prompt for AI assistance
Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:
You are an expert developer with deep knowledge of security, performance, and best practices.
### Context
File: backend/index.yaml
Lines: 2301-2305
Issue Type: functional-medium
Severity: medium
Issue Description:
The `faster-qwen3-tts-development` entry (the first merged block after the anchor) overrides `capabilities` but does NOT set a `uri` or `mirrors`. This means the development variant has no image URI; when LocalAI tries to resolve and pull this backend image for a development/master install, it will find no image reference and fail silently or with a confusing error. The `qwen-tts-development` entry and others in the file all carry explicit `uri` + `mirrors` pointing to `master-*` tagged images.
Current Code:
- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"
---
Security Scan Summary
No critical security issues detected. Scan completed in 15.1s.
Security scan powered by Codity.ai
License Compliance Scan
Weak copyleft licenses found: verify compatibility.
Some packages have unknown licenses: manual review required.
Medium Risk Licenses - 4 packages
MPL-2.0 (4 packages):
Unknown Licenses - 62 packages
...and 42 more
Code Quality Report: test-org-codity/LocalAI · PR #9
Scanned: 2026-06-03 23:26 UTC | Score: 22/100 | Provider: github
Executive Summary
Top Findings
[CQ-LLM-001]
| File | Critical | High | Medium | Low | Total |
|---|---|---|---|---|---|
| .github/workflows/backend.yml | 0 | 1 | 0 | 17 | 18 |
| Makefile | 0 | 0 | 0 | 2 | 2 |
| backend/index.yaml | 0 | 0 | 2 | 20 | 22 |
| backend/python/faster-qwen3-tts/backend.py | 0 | 0 | 1 | 24 | 25 |
| backend/python/faster-qwen3-tts/requirements.txt | 0 | 0 | 0 | 1 | 1 |
| backend/python/faster-qwen3-tts/test.py | 0 | 0 | 1 | 49 | 50 |
Recommendations
- Resolve High severity issues, especially error handling gaps and performance bottlenecks.
- Run automated tests after applying fixes to verify no regressions.
Description
This PR fixes #
Notes for Reviewers
Signed commits