Faster qwen3 tts by DhirenMhatre · Pull Request #9 · test-org-codity/LocalAI

DhirenMhatre · 2026-05-21T19:41:28Z

Description

This PR fixes #

Notes for Reviewers

Signed commits

Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

DhirenMhatre · 2026-05-21T19:44:02Z

@codity review

codity-ai · 2026-05-21T19:44:10Z

PR Summary

What Changed

Adds a new faster-qwen3-tts gRPC backend for real-time Qwen3-TTS voice cloning with CUDA graph capture.
Expands CI/CD to build CUDA 12/13 variants for AMD64 and NVIDIA L4T (ARM64) platforms.
Registers 8 backend variants in the index with development and latest tags.

Key Changes by Area

Backend Implementation: New gRPC server at backend/python/faster-qwen3-tts/ with LoadModel and TTS methods, voice cloning via reference audio/text.

Build System: Added to .NOTPARALLEL, prepare-test-extra, test-extra, and docker-build-backends Makefile targets.

CI/CD: New CUDA 12/13 builds for linux/amd64 and linux/arm64 (L4T). Also expanded voxcpm CPU builds to ARM64.

Files Changed

File	Changes Summary
.github/workflows/backend.yml	Added CUDA 12/13 builds for AMD64 and L4T ARM64; expanded voxcpm platforms
Makefile	Added faster-qwen3-tts to parallelization guards and test/build targets
backend/index.yaml	Registered faster-qwen3-tts with 8 variant entries
backend/python/faster-qwen3-tts/Makefile	Build automation for the new backend
backend/python/faster-qwen3-tts/backend.py	gRPC server implementation with LoadModel and TTS methods
backend/python/faster-qwen3-tts/install.sh	Installation script
backend/python/faster-qwen3-tts/requirements-cublas12.txt	CUDA 12 dependencies
backend/python/faster-qwen3-tts/requirements-cublas13.txt	CUDA 13 dependencies
backend/python/faster-qwen3-tts/requirements-l4t12.txt	Jetson L4T 12 with CUDA 12.9 PyTorch wheels
backend/python/faster-qwen3-tts/requirements-l4t13.txt	Jetson L4T 13 with CUDA 13.0 PyTorch wheels
backend/python/faster-qwen3-tts/requirements.txt	Base dependencies (grpcio, protobuf, soundfile, sox)
backend/python/faster-qwen3-tts/run.sh	Standard backend wrapper script
backend/python/faster-qwen3-tts/test.py	Unit tests with CUDA gating and end-to-end TTS test
backend/python/faster-qwen3-tts/test.sh	Test runner with unquoted path expansions

Review Focus Areas

Path quoting in backend/python/faster-qwen3-tts/test.sh:4-8 (unquoted expansions break on spaces/globs).
CUDA graph capture error handling in backend.py for edge cases on different GPU architectures.
L4T wheel compatibility: verify PyTorch CUDA 12.9/13.0 wheels work across Jetson variants.

Architecture

Design Decisions: Voice-clone only (no base voice synthesis) keeps scope narrow. CUDA graph capture trades flexibility for latency. Separate requirements files per CUDA version and platform (L4T vs desktop) manage dependency complexity explicitly.

Scalability & Extensibility: Backend follows existing gRPC pattern. 8 variants in index.yaml allow version pinning but increase maintenance surface. Out of scope: CPU fallback or quantization.

Risks: Intentional: NVIDIA-only, no CPU support. Unintentional: L4T wheel URLs may rot; test coverage gaps for actual hardware (tests gate on CUDA availability).

Merge Status

NOT MERGEABLE — PR Score 13/100, below threshold (50)

[H4] PR quality score (13) is below merge floor (50)
[H5] 7 HIGH-severity inline review findings need resolution (threshold: 3)
[H6] Code quality raw score (22) is below merge floor (40)

codity-ai · 2026-05-21T19:46:07Z

Workflow Diagrams

Automatically generated sequence diagrams showing the workflows in this PR

1. Analysis

This PR introduces a new TTS (Text-to-Speech...

Complexity: Complex • Components: faster-qwen3-tts gRPC backend service, GitHub Actions CI/CD workflow, Docker build matrix

sequenceDiagram
    title: faster-qwen3-tts Backend Integration Workflow

    participant Dev as Developer
    participant GH as GitHub Actions CI
    participant MK as Makefile
    participant DK as Docker Build
    participant BI as Backend Index
    participant LA as LocalAI Runtime

    Note over Dev,LA: PR introduces faster-qwen3-tts TTS backend with multi-platform CUDA support

    Dev->>MK: Add BACKEND_FASTER_QWEN3_TTS metadata<br/>Makefile:2/3 chunk
    MK-->>Dev: Register python backend with root context

    Dev->>GH: Push changes triggering backend.yml workflow

    par Parallel Build Matrix
        GH->>DK: Build CUDA 12 amd64<br/>backend.yml:210+
        Note right of DK: cublas, cuda 12.8, ubuntu 24.04
        DK-->>GH: Image: local-ai-backends:latest-gpu-nvidia-cuda-12-faster-qwen3-tts

        GH->>DK: Build CUDA 13 amd64<br/>backend.yml:588+
        Note right of DK: cublas, cuda 13.0, ubuntu 24.04
        DK-->>GH: Image: local-ai-backends:latest-gpu-nvidia-cuda-13-faster-qwen3-tts

        GH->>DK: Build L4T CUDA 12 arm64<br/>backend.yml:1345+
        Note right of DK: NVIDIA Jetson, arm64, ubuntu 22.04
        DK-->>GH: Image: local-ai-backends:latest-nvidia-l4t-faster-qwen3-tts

        GH->>DK: Build L4T CUDA 13 arm64<br/>backend.yml:731+
        Note right of DK: NVIDIA Jetson, arm64, ubuntu 24.04
        DK-->>GH: Image: local-ai-backends:latest-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts
    end

    GH->>BI: Register backend variants<br/>backend/index.yaml
    Note right of BI: YAML anchors define image URIs<br/>for runtime discovery

    LA->>BI: Query available backends at startup
    BI-->>LA: Return faster-qwen3-tts image URIs<br/>for detected platform

    alt CUDA GPU available
        LA->>DK: Pull and run faster-qwen3-tts container
        DK-->>LA: gRPC service ready on configured port
        LA->>DK: TTS request via gRPC
        DK-->>LA: Audio output stream
    else CPU only or CUDA unavailable
        LA->>LA: Skip faster-qwen3-tts registration
        Note right of LA: Backend requires NVIDIA GPU<br/>Graceful degradation to CPU TTS
    end

Note: Diagrams show detected patterns only. Complex workflows may require manual review.

codity-ai · 2026-05-21T19:47:53Z

+  capabilities:
+    nvidia: "cuda12-faster-qwen3-tts"
+    default: "cuda12-faster-qwen3-tts"
+    nvidia-cuda-13: "cuda13-faster-qwen3-tts"


On a CPU-only host, the default capability for the new backend resolves to cuda12-faster-qwen3-tts. A request such as backend: faster-qwen3-tts on a machine without NVIDIA hardware therefore always pulls a CUDA image and then fails at runtime, because backend.py rejects hosts where torch.cuda.is_available() returns False.

Remove the default capability for this backend, or point it at a CPU-safe fallback (or at no image), so that generic backend selection does not choose a CUDA-only container on non-NVIDIA machines.
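
The fallback behaviour described in this finding can be illustrated with a small sketch. The resolver below is hypothetical: `resolve_variant` and the `"cpu"` capability name are assumptions made for illustration, and LocalAI's real selection logic is not shown in this PR. The capability map is copied from the diff above.

```python
from typing import Optional


def resolve_variant(capabilities: dict, detected: str) -> Optional[str]:
    """Hypothetical resolver: exact capability match, else the 'default' entry."""
    return capabilities.get(detected, capabilities.get("default"))


# Capability map as added by this PR (from the diff excerpt).
with_default = {
    "nvidia": "cuda12-faster-qwen3-tts",
    "default": "cuda12-faster-qwen3-tts",
    "nvidia-cuda-13": "cuda13-faster-qwen3-tts",
}
# Same map with the 'default' entry removed, as the finding suggests.
without_default = {k: v for k, v in with_default.items() if k != "default"}

# "cpu" is an assumed name for a host with no NVIDIA capability.
cpu_pick = resolve_variant(with_default, "cpu")
cpu_pick_fixed = resolve_variant(without_default, "cpu")
```

Under these assumptions, `cpu_pick` is the CUDA 12 variant while `cpu_pick_fixed` is `None`, so a caller can report that no compatible variant exists without pulling an image that cannot start.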

File: backend/index.yaml · Lines: 544-547 · Issue type: functional-high · Severity: high

Current code:

capabilities:
  nvidia: "cuda12-faster-qwen3-tts"
  default: "cuda12-faster-qwen3-tts"
  nvidia-cuda-13: "cuda13-faster-qwen3-tts"
  nvidia-cuda-12: "cuda12-faster-qwen3-tts"

codity-ai · 2026-05-21T19:47:53Z

+            if not request.dst:
+                return backend_pb2.Result(
+                    success=False,
+                    message="dst (output path) is required"
+                )
+            text = request.text.strip()
+            if not text:
+                return backend_pb2.Result(
+                    success=False,
+                    message="Text is empty"
+                )
+
+            language = request.language if hasattr(request, 'language') and request.language else None
+            if not language or language == "":
+                language = "English"
+
+            ref_audio = self._get_ref_audio_path(request)
+            if not ref_audio:
+                return backend_pb2.Result(
+                    success=False,
+                    message="AudioPath is required for voice clone (set in LoadModel)"
+                )
+            ref_text = self.options.get("ref_text")
+            if not ref_text and hasattr(request, 'ref_text') and request.ref_text:
+                ref_text = request.ref_text
+            if not ref_text:
+                return backend_pb2.Result(
+                    success=False,
+                    message="ref_text is required for voice clone (set via LoadModel Options, e.g. ref_text:Your reference transcript)"
+                )
+
+            chunk_size = self.options.get("chunk_size")
+            generation_kwargs = {}
+            if chunk_size is not None:
+                generation_kwargs["chunk_size"] = int(chunk_size)
+
+            audio_list, sr = self.model.generate_voice_clone(
+                text=text,
+                language=language,
+                ref_audio=ref_audio,
+                ref_text=ref_text,
+                **generation_kwargs
+            )
+
+            if audio_list is None or (isinstance(audio_list, list) and len(audio_list) == 0):
+                return backend_pb2.Result(
+                    success=False,
+                    message="No audio output generated"
+                )
+            audio_data = audio_list[0] if isinstance(audio_list, list) else audio_list
+            sf.write(request.dst, audio_data, sr)


The server writes to a client-controlled filesystem path without validation, so restrict dst to an allowed output directory and reject absolute or traversing paths.

Suggested fix

if not request.dst:
    return backend_pb2.Result(
        success=False,
        message="dst (output path) is required"
    )
output_base = os.environ.get("LOCALAI_OUTPUT_DIR", "/tmp/localai")
requested_path = os.path.normpath(request.dst)
if os.path.isabs(requested_path):
    return backend_pb2.Result(success=False, message="absolute dst paths are not allowed")
final_path = os.path.normpath(os.path.join(output_base, requested_path))
if not final_path.startswith(os.path.abspath(output_base) + os.sep):
    return backend_pb2.Result(success=False, message="invalid dst path")
...
sf.write(final_path, audio_data, sr)
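
As a standalone illustration of the same containment check, here is a minimal sketch that can be run outside the gRPC server. It is a variation on the suggested fix, not the backend's code: `resolve_dst` is a hypothetical helper, and it swaps the `startswith` comparison for `os.path.realpath` plus `os.path.commonpath`, which compares whole path components and also resolves symlinks.

```python
import os
import tempfile


def resolve_dst(output_base: str, requested: str) -> str:
    """Return a path strictly inside output_base, or raise ValueError."""
    if not requested:
        raise ValueError("dst (output path) is required")
    if os.path.isabs(requested):
        raise ValueError("absolute dst paths are not allowed")
    base = os.path.realpath(output_base)
    final = os.path.realpath(os.path.join(base, requested))
    # commonpath works on whole components, so "/out-evil" is not mistaken
    # for a child of "/out" the way a bare prefix comparison could be.
    if final == base or os.path.commonpath([base, final]) != base:
        raise ValueError("invalid dst path")
    return final


# Usage with a throwaway directory standing in for the real output root.
base_dir = tempfile.mkdtemp()
accepted = resolve_dst(base_dir, os.path.join("voice", "out.wav"))
```

Raising `ValueError` keeps the helper independent of `backend_pb2`; a caller in the TTS handler would translate the exception into a `Result(success=False, ...)`. Whether relative-only `dst` values fit how callers actually invoke the backend is not confirmed by this PR.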

File: backend/python/faster-qwen3-tts/backend.py · Lines: 99-149 · Issue type: security-high · Severity: high

codity-ai · 2026-05-21T19:47:53Z

+        if not self.audio_path:
+            return None
+        if os.path.isabs(self.audio_path):
+            return self.audio_path
+        if self.model_file:
+            model_file_base = os.path.dirname(self.model_file)
+            ref_path = os.path.join(model_file_base, self.audio_path)
+            if os.path.exists(ref_path):
+                return ref_path
+        if self.model_path:
+            ref_path = os.path.join(self.model_path, self.audio_path)
+            if os.path.exists(ref_path):
+                return ref_path
+        return self.audio_path


AudioPath is accepted as an absolute path and otherwise returned unchecked, so validate it against an allowlisted base directory before opening local files.

Suggested fix

def _get_ref_audio_path(self, request):
    if not self.audio_path:
        return None
    allowed_bases = []
    if self.model_file:
        allowed_bases.append(os.path.abspath(os.path.dirname(self.model_file)))
    if self.model_path:
        allowed_bases.append(os.path.abspath(self.model_path))
    for base in allowed_bases:
        candidate = os.path.abspath(os.path.join(base, self.audio_path))
        if candidate.startswith(base + os.sep) and os.path.exists(candidate):
            return candidate
    return None
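
The allowlist lookup can also be written with `pathlib`, which makes the containment test explicit. This is a sketch under assumptions: `find_ref_audio` is a hypothetical free function (the backend uses a method on the servicer), and the temporary directory stands in for a model directory.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def find_ref_audio(audio_path: str, allowed_bases: Iterable[str]) -> Optional[Path]:
    """Return the first existing file under an allowed base, else None."""
    if not audio_path:
        return None
    for base in allowed_bases:
        root = Path(base).resolve()
        candidate = (root / audio_path).resolve()
        # relative_to raises ValueError when candidate lies outside root.
        # That covers ".." traversal and absolute audio_path values, since
        # joining an absolute path discards the root prefix.
        try:
            candidate.relative_to(root)
        except ValueError:
            continue
        if candidate.is_file():
            return candidate
    return None


# Usage: a temporary directory plays the role of the model directory.
root_dir = tempfile.mkdtemp()
(Path(root_dir) / "ref.wav").touch()
found = find_ref_audio("ref.wav", [root_dir])
escaped = find_ref_audio("../ref.wav", [root_dir])
```

Returning `None` for anything outside the allowed bases means the existing "AudioPath is required" error path in the TTS handler would handle rejected inputs without further changes.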

File: backend/python/faster-qwen3-tts/backend.py · Lines: 82-95 · Issue type: security-high · Severity: high

codity-ai · 2026-05-21T19:47:53Z

+        model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
+        self.audio_path = request.AudioPath if hasattr(request, 'AudioPath') and request.AudioPath else None
+        self.model_file = request.ModelFile if hasattr(request, 'ModelFile') and request.ModelFile else None
+        self.model_path = request.ModelPath if hasattr(request, 'ModelPath') and request.ModelPath else None
+
+        from faster_qwen3_tts import FasterQwen3TTS
+        print(f"Loading model from: {model_path}", file=sys.stderr)
+        try:
+            self.model = FasterQwen3TTS.from_pretrained(model_path)


Loading an arbitrary model identifier from the request lets remote callers trigger untrusted model downloads or local path access, so restrict model_path to an allowlist or server-side configuration.

Suggested fix

allowed_models = {
    "Qwen/Qwen3-TTS-12Hz-0.6B-Base",
}
model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
if model_path not in allowed_models:
    return backend_pb2.Result(success=False, message="unsupported model")
...
self.model = FasterQwen3TTS.from_pretrained(model_path)
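
A runnable version of the allowlist check, separated from the gRPC types, might look like the sketch below. `select_model` and the module-level constants are hypothetical names; only the default model identifier comes from the PR.

```python
from typing import AbstractSet

DEFAULT_MODEL = "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
ALLOWED_MODELS = frozenset({DEFAULT_MODEL})


def select_model(requested: str, allowed: AbstractSet[str] = ALLOWED_MODELS) -> str:
    """Fall back to the default when nothing is requested; reject unknown IDs."""
    model = requested or DEFAULT_MODEL
    if model not in allowed:
        raise ValueError(f"unsupported model: {model!r}")
    return model


chosen = select_model("")  # an empty request falls back to the default model
```

Passing the allowed set as a parameter leaves room for the server-side configuration the finding mentions, without hard-coding how that configuration is loaded.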

File: backend/python/faster-qwen3-tts/backend.py · Lines: 64-72 · Issue type: security-high · Severity: high

codity-ai · 2026-05-21T19:47:53Z

+--extra-index-url https://download.pytorch.org/whl/cu130
+torch
+torchaudio
+faster-qwen3-tts


Using an unpinned package name for faster-qwen3-tts allows a supply chain swap to a newer or compromised release, so pin it to an exact version.

Suggested change

-faster-qwen3-tts
+faster-qwen3-tts==<exact-version>
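
One way to catch unpinned entries before review is a small lint step. The sketch below is an assumption about how such a check could look, not part of the PR: `unpinned_requirements` is a hypothetical helper, and it only recognises plain `name==version` pins (no hashes, markers, or URL requirements). The input lines are the contents of requirements-cublas13.txt as shown in the diff.

```python
import re

# Matches "name==version", optionally with extras such as "pkg[extra]==1.2.3".
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def unpinned_requirements(lines):
    """Return requirement entries that lack an exact '==' pin."""
    missing = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            missing.append(line)
    return missing


requirements_cublas13 = [
    "--extra-index-url https://download.pytorch.org/whl/cu130",
    "torch",
    "torchaudio",
    "faster-qwen3-tts",
]
report = unpinned_requirements(requirements_cublas13)
```

Here `report` lists all three packages, which shows that the finding applies to `torch` and `torchaudio` as well as `faster-qwen3-tts`.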

File: backend/python/faster-qwen3-tts/requirements-cublas13.txt · Lines: 4-4 · Issue type: security-high · Severity: high

codity-ai · 2026-05-21T19:47:53Z

+--extra-index-url https://download.pytorch.org/whl/cu130
+torch
+torchaudio
+faster-qwen3-tts


Using an untrusted extra package index allows dependency confusion; pin the package to the official source or use --index-url if all packages must come from the PyTorch index.

Also reported at: backend/python/faster-qwen3-tts/requirements-l4t12.txt L1–L4

Suggested fix

--index-url https://download.pytorch.org/whl/cu130
torch==<pinned-version>
torchaudio==<pinned-version>
--extra-index-url https://pypi.org/simple
faster-qwen3-tts==<pinned-version>

File: backend/python/faster-qwen3-tts/requirements-l4t13.txt · Lines: 1-4 · Issue type: security-high · Severity: high

codity-ai · 2026-05-21T19:47:53Z

+    source $backend_dir/../common/libbackend.sh
+fi
+
+startBackend $@


Using $@ unquoted changes argument boundaries, so call startBackend with "$@".

Suggested change

-startBackend $@
+startBackend "$@"
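
The effect on argument boundaries can be approximated in Python. This is only a model of the shell behaviour, under stated assumptions: it mimics word splitting with the default IFS and ignores glob expansion, and the argument values are made up for illustration.

```python
def expand_unquoted(args):
    """Approximate unquoted $@: join, then re-split on whitespace (globbing ignored)."""
    return " ".join(args).split()


def expand_quoted(args):
    """Approximate "$@": every argument is forwarded unchanged."""
    return list(args)


# Hypothetical arguments; the last one contains a space.
argv = ["--addr", "localhost:50051", "--model-dir", "/models/my voices"]

unquoted = expand_unquoted(argv)
quoted = expand_quoted(argv)
```

In this model `quoted` keeps four arguments, while `unquoted` yields five because `/models/my voices` is split in two, which is the boundary change the finding describes.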

File: backend/python/faster-qwen3-tts/run.sh · Lines: 9-9 · Issue type: functional-high · Severity: high

codity-ai · 2026-05-21T19:47:53Z

+backend_dir=$(dirname $0)
+if [ -d $backend_dir/common ]; then
+    source $backend_dir/common/libbackend.sh
+else
+    source $backend_dir/../common/libbackend.sh


Unquoted path expansions allow word splitting and option interpretation, so quote $0 and all $backend_dir uses.

Also reported at: backend/python/faster-qwen3-tts/install.sh L6–L10

Suggested fix

backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi

File: backend/python/faster-qwen3-tts/run.sh · Lines: 2-6 · Issue type: security-medium · Severity: medium

codity-ai · 2026-05-21T19:47:53Z

+    @unittest.skipUnless(
+        __import__("torch").cuda.is_available(),
+        "faster-qwen3-tts TTS requires CUDA",


The skip condition imports torch at module load time and will crash the test file on environments without torch installed, so guard the import and default to skipping.

Suggested fix

try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except Exception:
    _HAS_CUDA = False

@unittest.skipUnless(
    _HAS_CUDA,
    "faster-qwen3-tts TTS requires CUDA",
)

File: backend/python/faster-qwen3-tts/test.py · Lines: 52-54 · Issue type: functional-medium · Severity: medium

codity-ai · 2026-05-21T19:47:56Z

Nitpicks (Low Priority)

Found 1 low-priority suggestion for code improvement

`backend/python/faster-qwen3-tts/test.sh` (lines 4-8)

The path expansions are unquoted, so spaces or glob characters in the script path can break the directory test and source command; quote the variable expansions.

Code Suggestion or Comments

backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi

Issue type: robustness-low · Severity: low

Current code:

backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi

codity-ai · 2026-05-21T19:48:48Z

Security Scan Summary

Metric	Value
Vulnerabilities	Critical: 0
Overall Risk	Clean
Files Scanned	14

No critical security issues detected

Scan completed in 9.5s

Security scan powered by Codity.ai

codity-ai · 2026-05-21T19:49:30Z

Dependency vulnerability scanning

Metric	Value
Vulnerabilities Found	4
Scanner	pip-audit

View vulnerability details (4 items)

1. pip 24.0

Advisory: GHSA-4xh5-x5gv-qwph
Fixed in: 25.3

When extracting a tar archive pip may not check symbolic links point into the extraction directory if the tarfile module doesn't implement PEP 706. Note that upgrading pip to a "fixed" version for thi

2. pip 24.0

Advisory: GHSA-6vgw-5pg2-w6jp
Fixed in: 26.0

When pip is installing and extracting a maliciously crafted wheel archive, files may be extracted outside the installation directory. The path traversal is limited to prefixes of the installation dire

3. pip 24.0

Advisory: GHSA-58qw-9mgm-455v
Fixed in: 26.1

pip handles concatenated tar and ZIP files as ZIP files regardless of filename or whether a file is both a tar and ZIP file. This behavior could result in confusing installation behavior, such as inst

4. pip 24.0

Advisory: GHSA-jp4c-xjxw-mgf9
Fixed in: 26.1

pip prior to version 26.1 would run self-update check functionality after installing wheel files which required importing well-known Python modules names. These module imports were intentionally defer
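
The four "Fixed in" values can be reduced to a single minimum pip version with a few lines of Python. This is a sketch that only compares version numbers taken from the scan output above. The advisory descriptions are truncated in the report, so it makes no claim about whether upgrading pip alone resolves each item, and `parse_version` is a hypothetical helper that handles simple dotted releases only.

```python
def parse_version(text):
    """Parse a simple dotted release such as '24.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


installed = "24.0"
fixed_in = {
    "GHSA-4xh5-x5gv-qwph": "25.3",
    "GHSA-6vgw-5pg2-w6jp": "26.0",
    "GHSA-58qw-9mgm-455v": "26.1",
    "GHSA-jp4c-xjxw-mgf9": "26.1",
}

# Advisories whose fixed version is newer than the installed pip.
still_open = sorted(
    advisory
    for advisory, version in fixed_in.items()
    if parse_version(installed) < parse_version(version)
)
# The lowest pip version at or above every "Fixed in" value.
minimum_clean = max(fixed_in.values(), key=parse_version)
```

With the values reported by the scan, all four advisories are open for pip 24.0 and `minimum_clean` is `"26.1"`.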

Powered by Codity.ai · Docs

codity-ai · 2026-05-21T19:49:54Z

License Compliance Scan

Metric	Value
Packages Scanned	337
High Risk (Strong Copyleft)	0
Medium Risk (Weak Copyleft)	4
Low Risk (Permissive)	260
Unknown License	73

Weak copyleft licenses found - verify compatibility

Some packages have unknown licenses - manual review required

Medium Risk Licenses - 4 packages

MPL-2.0 (4 packages):

github.com/libp2p/go-yamux/v5 5.0.1
github.com/hashicorp/golang-lru 1.0.2
github.com/hashicorp/golang-lru/v2 2.0.7
github.com/shoenig/go-m1cpu 0.1.6

Unknown Licenses - 73 packages

github.com/docker/docker-credential-helpers 0.9.3
github.com/docker/cli 29.0.3+incompatible
github.com/docker/go-connections 0.6.0
github.com/docker/go-units 0.5.0
github.com/dsnet/compress 0.0.2-0.20210315054119-f66993602bf5
github.com/flynn/noise 1.1.0
github.com/go-audio/audio 1.0.0
github.com/go-logr/logr 1.4.3
github.com/go-audio/riff 1.0.0
github.com/go-logr/stdr 1.2.2
github.com/go-openapi/jsonpointer 0.21.0
github.com/go-openapi/swag 0.23.0
github.com/go-openapi/jsonreference 0.21.0
github.com/gogo/protobuf 1.3.2
github.com/go-openapi/spec 0.21.0
github.com/golang/groupcache 0.0.0-20210331224755-41bb18bfe9da
github.com/golang/snappy 0.0.4
github.com/google/btree 1.1.3
github.com/google/go-cmp 0.7.0
github.com/google/pprof 0.0.0-20260115054156-294ebfa9ad83

...and 53 more


codity-ai · 2026-05-21T19:50:05Z

Code Quality Report — test-org-codity/LocalAI · PR #9

Scanned: 2026-05-21 19:50 UTC | Score: 22/100 | Provider: github

Executive Summary

Severity	Count
Critical	0
High	1
Medium	4
Low	113

Top Findings

[CQ-LLM-001] `.github/workflows/backend.yml:211` (Duplication · HIGH)

Issue: Repeated configuration blocks for 'cublas' and 'l4t' build types with similar parameters.
Suggestion: Consolidate similar build configurations into a single template to adhere to DRY principles.

- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "8"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "faster-qwen3-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
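
One way to act on the DRY suggestion is to generate the near-identical matrix entries from a shared base. The sketch below is hypothetical tooling, not something the PR contains: `cuda_entry` is an invented helper, and it assumes the CUDA 13 entry differs from the CUDA 12 entry only in its version fields and tag suffix, which the excerpt does not confirm.

```python
# Fields shared by the amd64 cublas entries, copied from the CUDA 12 block above.
BASE = {
    "build-type": "cublas",
    "platforms": "linux/amd64",
    "tag-latest": "auto",
    "runs-on": "ubuntu-latest",
    "base-image": "ubuntu:24.04",
    "skip-drivers": "false",
    "backend": "faster-qwen3-tts",
    "dockerfile": "./backend/Dockerfile.python",
    "context": "./",
    "ubuntu-version": "2404",
}


def cuda_entry(major: str, minor: str) -> dict:
    """Build one matrix entry; only the CUDA version and tag suffix vary."""
    entry = dict(BASE)
    entry["cuda-major-version"] = major
    entry["cuda-minor-version"] = minor
    entry["tag-suffix"] = f"-gpu-nvidia-cuda-{major}-faster-qwen3-tts"
    return entry


matrix = [cuda_entry("12", "8"), cuda_entry("13", "0")]
```

The generated list could then be serialised into the workflow by whatever templating step the repository prefers; that step is outside the scope of this sketch.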

[CQ-LLM-002] `backend/index.yaml:529` (Maintainability · MEDIUM)

Issue: Magic strings used for backend names and URLs without explanation.
Suggestion: Define constants for backend names and URLs to improve maintainability and readability.

name: "faster-qwen3-tts"

[CQ-LLM-003] `backend/index.yaml:529` (Documentation · MEDIUM)

Issue: Missing documentation for the new 'faster-qwen3-tts' configuration.
Suggestion: Add comments or docstrings explaining the purpose and usage of the 'faster-qwen3-tts' configuration.

description: |
  Real-time Qwen3-TTS inference using CUDA graph capture. Voice clone only; requires NVIDIA GPU with CUDA.

[CQ-012] `backend/python/faster-qwen3-tts/backend.py:184` (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(_ONE_DAY_IN_SECONDS)
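
Whether this finding applies depends on whether the sleep runs inside an asyncio event loop, which the excerpt does not show. If the call simply parks the main thread of a threaded server, one alternative to a day-long sleep is to wait on a `threading.Event` that a signal handler or another thread can set. The sketch below assumes that design; `serve_until_stopped` and `on_stop` are hypothetical names.

```python
import threading


def serve_until_stopped(stop_event: threading.Event, on_stop, poll_seconds: float = 1.0):
    """Block the calling thread until stop_event is set, then run on_stop()."""
    # Event.wait returns True as soon as the event is set, so shutdown is
    # noticed promptly instead of after a long sleep interval.
    while not stop_event.wait(timeout=poll_seconds):
        pass
    on_stop()


# Usage: a timer stands in for a signal handler requesting shutdown.
stop = threading.Event()
calls = []
threading.Timer(0.05, stop.set).start()
serve_until_stopped(stop, lambda: calls.append("stopped"), poll_seconds=0.01)
```

In a real server, `on_stop` would be the place to stop the gRPC server; that wiring is left out because the surrounding code is not part of this excerpt.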

[CQ-012] `backend/python/faster-qwen3-tts/test.py:24` (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(15)

[CQ-008] `.github/workflows/backend.yml:214` (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

cuda-major-version: "12"

[CQ-008] `.github/workflows/backend.yml:218` (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'

[CQ-008] `.github/workflows/backend.yml:220` (Maintainability · LOW)

Issue: Magic number 24 in code
Suggestion: Extract to a named constant

base-image: "ubuntu:24.04"

[CQ-008] `.github/workflows/backend.yml:225` (Maintainability · LOW)

Issue: Magic number 2404 in code
Suggestion: Extract to a named constant

ubuntu-version: '2404'

[CQ-008] `.github/workflows/backend.yml:592` (Maintainability · LOW)

Issue: Magic number 13 in code
Suggestion: Extract to a named constant

cuda-major-version: "13"

Per-File Breakdown

File	High	Medium	Low	Total
`.github/workflows/backend.yml`	1	0	16	17
`Makefile`	0	0	3	3
`backend/index.yaml`	0	2	20	22
`backend/python/faster-qwen3-tts/backend.py`	0	1	24	25
`backend/python/faster-qwen3-tts/requirements.txt`	0	0	1	1
`backend/python/faster-qwen3-tts/test.py`	0	1	49	50

Recommendations

Resolve High severity issues, especially error handling gaps and performance bottlenecks.

Run automated tests after applying fixes to verify no regressions.

chay2199 · 2026-06-03T23:19:48Z

@codity review

codity-chait · 2026-06-03T23:20:03Z

PR Summary

What Changed

Adds a new faster-qwen3-tts gRPC backend for real-time Qwen3-TTS voice cloning with CUDA graph capture.
Integrates CUDA 12/13 builds for x86_64 and Jetson L4T (ARM64) into CI/CD pipeline.
Registers 9 backend variants in the index with platform-specific dependencies.

Key Changes by Area

Backend Implementation: New gRPC server with LoadModel and TTS methods, enforces CUDA at runtime, supports voice cloning via reference audio/text.

CI/CD: Added CUDA 12/13 builds for linux/amd64 and NVIDIA L4T builds for linux/arm64. Fixed voxcpm CPU build to include ARM64.

Build System: Added faster-qwen3-tts to parallel-safe targets and docker build pipeline.

Files Changed

File	Changes Summary
.github/workflows/backend.yml	Added CUDA 12/13 builds for x86_64 and Jetson L4T; fixed voxcpm ARM64 build
Makefile	Added `faster-qwen3-tts` to `.NOTPARALLEL`, test, and docker build targets
backend/index.yaml	Registered backend with 9 variants (CUDA 12/13, L4T, dev tags)
backend/python/faster-qwen3-tts/Makefile	Standard backend build configuration
backend/python/faster-qwen3-tts/backend.py	gRPC server with CUDA enforcement and voice cloning support
backend/python/faster-qwen3-tts/install.sh	Installation script with unquoted path variables (see risks)
backend/python/faster-qwen3-tts/requirements-cublas12.txt	PyTorch + faster-qwen3-tts for CUDA 12
backend/python/faster-qwen3-tts/requirements-cublas13.txt	PyTorch + faster-qwen3-tts for CUDA 13
backend/python/faster-qwen3-tts/requirements-l4t12.txt	Jetson L4T 12.0+ dependencies
backend/python/faster-qwen3-tts/requirements-l4t13.txt	Jetson L4T 13.0+ dependencies
backend/python/faster-qwen3-tts/requirements.txt	Base dependencies (grpcio, soundfile, sox)
backend/python/faster-qwen3-tts/run.sh	Runtime wrapper with unquoted paths, missing `set -e`
backend/python/faster-qwen3-tts/test.py	Unit tests with CUDA-gated TTS test
backend/python/faster-qwen3-tts/test.sh	Test runner script

Review Focus Areas

Path quoting in shell scripts (`install.sh:6-11`, `run.sh:2-7`): unquoted variables risk breakage when paths contain spaces.
Missing `set -e` in `run.sh`: errors in sourced scripts are silently ignored.
Redundant language check in `backend.py:110-113`: `not language` already covers empty strings.

Architecture

Design Decisions: Voice cloning only (no base voice generation). CUDA enforcement at runtime rather than build time allows single image with runtime detection. Platform-specific requirements files isolate Jetson wheel dependencies.

Risks: Shell scripts have unquoted path variables (possibly intentional, but should be fixed). `run.sh` lacks `set -e`, so errors in sourced scripts are silently ignored. Backend is GPU-only by design, with no CPU fallback.

Merge Status

NOT MERGEABLE — PR Score 33/100, below threshold (50)

[H4] PR quality score (33) is below merge floor (50)
[H6] Code quality raw score (22) is below merge floor (40)

codity-chait · 2026-06-03T23:21:28Z

Workflow Diagrams

Automatically generated sequence diagrams showing the workflows in this PR

1. Workflow Analysis

Complexity: Complex • Components: faster-qwen3-tts gRPC backend, GitHub Actions CI/CD workflow, Docker build system

sequenceDiagram
    title: Faster Qwen3 TTS Backend Integration Workflow

    participant Dev as Developer
    participant GH as GitHub Actions
    participant Docker as Docker Build
    participant Make as Makefile
    participant Backend as faster_qwen3_tts
    participant LocalAI as LocalAI Core
    participant HF as HuggingFace

    Dev->>Make: Add faster-qwen3-tts to NOTPARALLEL list
    Make->>Make: Register backend metadata<br/>BACKEND_FASTER_QWEN3_TTS

    Dev->>GH: Push to faster-qwen3-tts branch
    GH->>GH: Trigger backend.yml workflow

    par CUDA 12 Build
        GH->>Docker: Build cublas CUDA 12<br/>linux/amd64
        Docker->>Backend: Install requirements-cublas12.txt
    and CUDA 13 Build
        GH->>Docker: Build cublas CUDA 13<br/>linux/amd64
        Docker->>Backend: Install requirements-cublas13.txt
    and L4T CUDA 12 Build
        GH->>Docker: Build l4t CUDA 12<br/>linux/arm64
        Docker->>Backend: Install requirements-l4t12.txt
    and L4T CUDA 13 Build
        GH->>Docker: Build l4t CUDA 13<br/>linux/arm64
        Docker->>Backend: Install requirements-l4t13.txt
    end

    Docker->>Backend: Run install.sh
    Backend->>HF: Download Qwen/Qwen3-TTS-12Hz-0.6B-Base

    Note over Backend: CUDA graph capture enabled<br/>Voice clone only mode

    GH->>Docker: Tag and push images
    Docker->>GH: Publish to quay.io/localai-backends

    Dev->>Make: make docker-build-faster-qwen3-tts
    Make->>Docker: Build local-ai-backend:faster-qwen3-tts
    Docker->>Make: Save as backend-images/faster-qwen3-tts.tar
    Make->>LocalAI: Install via ocifile protocol

    LocalAI->>Backend: Initialize gRPC service
    Backend->>Backend: Check torch.cuda.is_available

    alt CUDA available
        Backend->>Backend: Load model with CUDA graph
    else CUDA not available
        Backend->>Backend: Exit with error
    end

    LocalAI->>Backend: TTS request via backend_pb2
    Backend->>Backend: Generate audio with soundfile
    Backend->>LocalAI: Return audio data

    Note over Backend,LocalAI: Backend registered in<br/>backend/index.yaml for<br/>multiple hardware variants

Note: Diagrams show detected patterns only. Complex workflows may require manual review.

codity-chait · 2026-06-03T23:23:12Z

+            key, value = opt.split(":", 1)
+            if is_float(value):
+                value = float(value)
+            elif is_int(value):
+                value = int(value)
+            elif value.lower() in ["true", "false"]:
+                value = value.lower() == "true"
+            self.options[key] = value
+
+        model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
+        self.audio_path = request.AudioPath if hasattr(request, 'AudioPath') and request.AudioPath else None


The option-parsing loop applies is_float before is_int. Since is_float succeeds for any valid integer string (e.g. "1" parses to 1.0), integer-valued options such as chunk_size will be stored as float instead of int. Later, int(chunk_size) is called defensively, but ref_text or other string options that happen to be purely numeric will also be silently mistyped. More critically, this means an integer option like chunk_size:512 becomes 512.0 (float), and if the upstream generate_voice_clone is strict about type, it may raise a TypeError at runtime.

Suggested fix

            if is_int(value):
                value = int(value)
            elif is_float(value):
                value = float(value)
            elif value.lower() in ["true", "false"]:
                value = value.lower() == "true"
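
To make the effect of the reordering concrete, here is a self-contained sketch of the parsing loop. The definitions of `is_int` and `is_float` are not shown in the diff, so the versions below are stand-ins, and `parse_option_value` / `parse_options` are hypothetical names. Option keys other than `chunk_size` and `ref_text` are made up for illustration.

```python
def is_int(value: str) -> bool:
    """Stand-in helper: True if the string parses as an int."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def is_float(value: str) -> bool:
    """Stand-in helper: True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def parse_option_value(value: str):
    """Coerce an option string, trying the narrowest type first."""
    if is_int(value):          # "512" stays an int
        return int(value)
    if is_float(value):        # "0.7" becomes a float
        return float(value)
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    return value               # anything else stays a string


def parse_options(raw_options):
    """Parse 'key:value' strings into a dict of typed values."""
    options = {}
    for opt in raw_options:
        if ":" not in opt:
            continue  # skip malformed entries
        key, value = opt.split(":", 1)
        options[key] = parse_option_value(value)
    return options


opts = parse_options(["chunk_size:512", "temperature:0.7", "stream:true", "ref_text:hello"])
```

With this ordering `opts["chunk_size"]` is the int `512` rather than `512.0`. A purely numeric `ref_text` would still be coerced to a number, so string-only options may need to be exempted by key.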

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 55-65
Issue Type: functional-high
Severity: high

Issue Description:
The option-parsing loop applies `is_float` before `is_int`. Since `is_float` succeeds for any valid integer string (e.g. `"1"` parses to `1.0`), integer-valued options such as `chunk_size` will be stored as `float` instead of `int`. Later, `int(chunk_size)` is called defensively, but `ref_text` or other string options that happen to be purely numeric will also be silently mistyped. More critically, this means an integer option like `chunk_size:512` becomes `512.0` (float), and if the upstream `generate_voice_clone` is strict about type, it may raise a `TypeError` at runtime.

Current Code:
            if is_float(value):
                value = float(value)
            elif is_int(value):
                value = int(value)
            elif value.lower() in ["true", "false"]:
                value = value.lower() == "true"

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:12Z

+                return backend_pb2.Result(
+                    success=False,
+                    message="dst (output path) is required"
+                )
+            text = request.text.strip()
+            if not text:


If TTS is called before LoadModel (or if LoadModel failed after the CUDA check but before setting self.audio_path, self.model_file, self.model_path, or self.options), accessing self.audio_path / self.options etc. in _get_ref_audio_path and the rest of TTS will raise an AttributeError, crashing the gRPC handler and returning no structured error to the caller. The server continues running but that request is silently aborted. A guard check for model readiness is needed.

Suggested fix

    def TTS(self, request, context):
        try:
            if not hasattr(self, 'model') or self.model is None:
                return backend_pb2.Result(
                    success=False,
                    message="Model not loaded. Call LoadModel first."
                )
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )
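
An alternative to the `hasattr` check is to define every attribute in `__init__`, so that `TTS` can never hit an `AttributeError` no matter how far `LoadModel` got. The sketch below is an assumption about how that could look, not the PR's code: `Result` is a stand-in for `backend_pb2.Result` so the example runs without gRPC, and `BackendSketch` with its simplified method signatures is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Result:
    """Stand-in for backend_pb2.Result (hypothetical, for illustration only)."""
    success: bool
    message: str = ""


@dataclass
class BackendSketch:
    # Every attribute TTS reads is defined up front with a safe default.
    model: Optional[Any] = None
    audio_path: Optional[str] = None
    options: Dict[str, Any] = field(default_factory=dict)

    def LoadModel(self, model: Any) -> Result:
        # Assign state only after loading has succeeded.
        self.model = model
        return Result(success=True, message="Model loaded successfully")

    def TTS(self, text: str, dst: str) -> Result:
        if self.model is None:
            return Result(False, "Model not loaded. Call LoadModel first.")
        if not dst:
            return Result(False, "dst (output path) is required")
        return Result(True, "ok")  # real synthesis would happen here


backend = BackendSketch()
before = backend.TTS("hello", "/tmp/out.wav")
backend.LoadModel(object())
after = backend.TTS("hello", "/tmp/out.wav")
```

Here `before.success` is `False` with a structured message, and `after.success` is `True` once a model has been assigned.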

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 100-105
Issue Type: robustness-high
Severity: high

Issue Description:
If `TTS` is called before `LoadModel` (or if `LoadModel` failed after the CUDA check but before setting `self.audio_path`, `self.model_file`, `self.model_path`, or `self.options`), accessing `self.audio_path` / `self.options` etc. in `_get_ref_audio_path` and the rest of `TTS` will raise an `AttributeError`, crashing the gRPC handler and returning no structured error to the caller. The server continues running but that request is silently aborted. A guard check for model readiness is needed.

Current Code:
    def TTS(self, request, context):
        try:
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:12Z

+            audio_list, sr = self.model.generate_voice_clone(
+                text=text,
+                language=language,
+                ref_audio=ref_audio,


The output file path request.dst is passed directly to sf.write() without validation. A caller can write audio output to any filesystem location the process has access to (e.g. /etc/cron.d/backdoor.wav, overwriting existing files, or writing outside the intended temp directory). The path should be validated to fall within the expected output directory before use.
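
One possible shape for that validation is sketched below. It assumes the backend knows a directory it is allowed to write into. The diff does not show such a setting, so `allowed_root` and the helper name `resolve_output_path` are hypothetical. Whether LocalAI already constrains `dst` before it reaches the backend is also not visible in this excerpt.

```python
import os


def resolve_output_path(dst: str, allowed_root: str) -> str:
    """Return the real path of dst if it lies inside allowed_root, else raise ValueError."""
    root = os.path.realpath(allowed_root)
    # Relative dst values are interpreted relative to the allowed root;
    # an absolute dst replaces the root in os.path.join and is checked as-is.
    candidate = os.path.realpath(os.path.join(root, dst))
    # realpath has already resolved ".." segments and symlinks, so a simple
    # common-prefix comparison on path components is enough here.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"output path escapes {root!r}: {dst!r}")
    return candidate
```

The handler would then call `sf.write(resolve_output_path(request.dst, allowed_root), audio_data, sr)` and turn a `ValueError` into a failed `Result`.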

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 138-138
Issue Type: security-medium
Severity: medium

Issue Description:
The output file path `request.dst` is passed directly to `sf.write()` without validation. A caller can write audio output to any filesystem location the process has access to (e.g. `/etc/cron.d/backdoor.wav`, overwriting existing files, or writing outside the intended temp directory). The path should be validated to fall within the expected output directory before use.

Current Code:
sf.write(request.dst, audio_data, sr)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:12Z

+grpcio==1.71.0
+protobuf
+certifi
+packaging==24.1
+soundfile
+setuptools
+six
+sox


All dependency versions in requirements.txt are unpinned except grpcio==1.71.0 and packaging==24.1. This means protobuf, certifi, soundfile, setuptools, six, and sox will resolve to whatever is latest at install time, potentially breaking the backend silently on a future pip install if a breaking release is published. The CUDA-specific requirements files also leave torch, torchaudio, and faster-qwen3-tts fully unpinned. For a production backend image, pinning is strongly recommended.
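
The unpinned set can be checked mechanically. The sketch below is a small helper (hypothetical, not part of the PR) that lists every requirement lacking an exact `==` pin, run against the eight lines quoted above. It only handles simple `name` / `name==version` lines plus comments and pip options, not the full requirements grammar.

```python
import re


def find_unpinned(requirement_lines):
    """Return package names that lack an exact '==' pin."""
    unpinned = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" in line:
            continue
        # The name is everything before the first specifier, extra, or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        unpinned.append(name)
    return unpinned


requirements = [
    "grpcio==1.71.0",
    "protobuf",
    "certifi",
    "packaging==24.1",
    "soundfile",
    "setuptools",
    "six",
    "sox",
]
unpinned = find_unpinned(requirements)
```

For this input `unpinned` holds the six names the review calls out: `protobuf`, `certifi`, `soundfile`, `setuptools`, `six`, and `sox`.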

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/requirements.txt
Lines: 1-8
Issue Type: maintainability-medium
Severity: medium

Issue Description:
All dependency versions in `requirements.txt` are unpinned except `grpcio==1.71.0` and `packaging==24.1`. This means `protobuf`, `certifi`, `soundfile`, `setuptools`, `six`, and `sox` will resolve to whatever is latest at install time, potentially breaking the backend silently on a future pip install if a breaking release is published. The CUDA-specific requirements files also leave `torch`, `torchaudio`, and `faster-qwen3-tts` fully unpinned. For a production backend image, pinning is strongly recommended.

Current Code:
grpcio==1.71.0
protobuf
certifi
packaging==24.1
soundfile
setuptools
six
sox

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:12Z

+            cwd=os.path.dirname(os.path.abspath(__file__)),
+        )


The test setUp unconditionally sleeps 15 seconds waiting for the backend process to start. If the process fails immediately (e.g. import error, port conflict), the test suite still waits the full 15 seconds and then gets cryptic gRPC connection-refused errors rather than a clear failure. A readiness poll with a short sleep-and-retry loop would both speed up the common case and give a clearer failure message.
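
A readiness poll along those lines could look like the sketch below. It uses only the standard library and checks that the TCP port accepts connections, which is a weaker signal than a completed gRPC handshake. The helper name `wait_for_port` and its parameters are assumptions, not code from the PR.

```python
import socket
import time


def wait_for_port(host, port, timeout=15.0, interval=0.2, process=None):
    """Poll until host:port accepts TCP connections.

    Returns True once a connection succeeds and False on timeout.
    Raises RuntimeError if `process` (a subprocess.Popen) exits first,
    so an import error or port conflict surfaces immediately.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process is not None and process.poll() is not None:
            raise RuntimeError(f"backend exited early with code {process.returncode}")
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet; retry shortly
    return False
```

In `setUp`, `self.assertTrue(wait_for_port("localhost", 50052, process=self.service))` would replace the fixed 15-second sleep. If `grpcio` is available in the test environment, `grpc.channel_ready_future(channel).result(timeout=...)` is a stricter check of the same kind.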

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/test.py
Lines: 22-23
Issue Type: robustness-low
Severity: low

Issue Description:
The test `setUp` unconditionally sleeps 15 seconds waiting for the backend process to start. If the process fails immediately (e.g. import error, port conflict), the test suite still waits the full 15 seconds and then gets cryptic gRPC connection-refused errors rather than a clear failure. A readiness poll with a short sleep-and-retry loop would both speed up the common case and give a clearer failure message.

Current Code:
        self.service = subprocess.Popen(
            ["python3", "backend.py", "--addr", "localhost:50052"],

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:12Z

+                    CUDA=True,
+                ),
+                timeout=10.0,
+            )
+        self.assertFalse(response.success)
+
+    @unittest.skipUnless(
+        __import__("torch").cuda.is_available(),
+        "faster-qwen3-tts TTS requires CUDA",
+    )
+    def test_tts(self):
+        import soundfile as sf


test_load_model_requires_cuda asserts self.assertFalse(response.success) unconditionally. On a CI runner that does have a GPU, LoadModel will succeed and the assertion will fail, breaking the test suite on GPU-enabled hosts. The test intent is to confirm behaviour when CUDA is absent; it should be skipped when CUDA is available.

Suggested fix

    @unittest.skipIf(
        __import__("torch").cuda.is_available(),
        "Skipping CUDA-required test on GPU-enabled host",
    )
    def test_load_model_requires_cuda(self):
        with grpc.insecure_channel("localhost:50052") as channel:
            stub = backend_pb2_grpc.BackendStub(channel)
            response = stub.LoadModel(
                backend_pb2.ModelOptions(
                    Model="Qwen/Qwen3-TTS-12Hz-0.6B-Base",
                    CUDA=True,
                ),
                timeout=10.0,
            )
        self.assertFalse(response.success)
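
Both skip decorators evaluate `__import__("torch")` when the test class is defined, so if `torch` cannot be imported in the test environment the whole module fails to load rather than skipping. A small wrapper avoids that. The sketch below is a suggestion with a hypothetical helper name, `cuda_available`, and is not code from the PR.

```python
import importlib
import unittest


def cuda_available() -> bool:
    """Return True only if torch imports cleanly and reports a usable CUDA device."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return False  # no torch means no CUDA as far as the tests are concerned
    return bool(torch.cuda.is_available())


class SkipExample(unittest.TestCase):
    @unittest.skipIf(cuda_available(), "only meaningful on hosts without CUDA")
    def test_runs_without_cuda(self):
        self.assertFalse(cuda_available())

    @unittest.skipUnless(cuda_available(), "requires CUDA")
    def test_runs_with_cuda(self):
        self.assertTrue(cuda_available())
```

On any given host exactly one of the two example tests runs and the other is reported as skipped.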

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/test.py
Lines: 46-57
Issue Type: functional-medium
Severity: medium

Issue Description:
`test_load_model_requires_cuda` asserts `self.assertFalse(response.success)` unconditionally. On a CI runner that *does* have a GPU, `LoadModel` will succeed and the assertion will fail, breaking the test suite on GPU-enabled hosts. The test intent is to confirm behaviour when CUDA is *absent*; it should be skipped when CUDA is available.

Current Code:
    def test_load_model_requires_cuda(self):
        with grpc.insecure_channel("localhost:50052") as channel:
            stub = backend_pb2_grpc.BackendStub(channel)
            response = stub.LoadModel(
                backend_pb2.ModelOptions(
                    Model="Qwen/Qwen3-TTS-12Hz-0.6B-Base",
                    CUDA=True,
                ),
                timeout=10.0,
            )
        self.assertFalse(response.success)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:12Z

  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-qwen-tts"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-qwen-tts
+## faster-qwen3-tts
+- !!merge <<: *faster-qwen3-tts


The faster-qwen3-tts-development entry (the first merged block after the anchor) overrides capabilities but does NOT set a uri or mirrors. This means the development variant has no image URI; when LocalAI tries to resolve and pull this backend image for a development/master install, it will find no image reference and fail silently or with a confusing error. The qwen-tts-development entry and others in the file all carry explicit uri + mirrors pointing to master-* tagged images.

Suggested fix

- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts"
  mirrors:
    - localai/localai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/index.yaml
Lines: 2301-2305
Issue Type: functional-medium
Severity: medium

Issue Description:
The `faster-qwen3-tts-development` entry (the first merged block after the anchor) overrides `capabilities` but does NOT set a `uri` or `mirrors`. This means the development variant has no image URI; when LocalAI tries to resolve and pull this backend image for a development/master install, it will find no image reference and fail silently or with a confusing error. The `qwen-tts-development` entry and others in the file all carry explicit `uri` + `mirrors` pointing to `master-*` tagged images.

Current Code:
- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:15Z

Nitpicks (Low Priority)

Found 3 low-priority suggestions for code improvement

`backend/python/faster-qwen3-tts/backend.py` (lines 110-113)

The language check if not language or language == "" is redundant. If language is already None or an empty string "", the first condition not language is already True for both; the second language == "" is never independently needed. This is a minor logic smell with no functional impact.

Code Suggestion or Comments

            language = request.language if hasattr(request, 'language') and request.language else None
            if not language:
                language = "English"
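
The two-step check can also collapse into a single expression. The sketch below uses `types.SimpleNamespace` as a stand-in for the gRPC request object. `resolve_language` and `DEFAULT_LANGUAGE` are hypothetical names, and only the `"English"` default comes from the diff.

```python
from types import SimpleNamespace

DEFAULT_LANGUAGE = "English"


def resolve_language(request) -> str:
    """Return request.language, or the default when it is missing, None, or empty."""
    # getattr covers a missing field; `or` covers both None and "".
    return getattr(request, "language", None) or DEFAULT_LANGUAGE


missing = resolve_language(SimpleNamespace())
empty = resolve_language(SimpleNamespace(language=""))
explicit = resolve_language(SimpleNamespace(language="French"))
```

Both `missing` and `empty` fall back to the default, while `explicit` keeps the caller's value.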

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 110-113
Issue Type: functional-low
Severity: low

Issue Description:
The `language` check `if not language or language == ""` is redundant. If `language` is already `None` or an empty string `""`, the first condition `not language` is already `True` for both; the second `language == ""` is never independently needed. This is a minor logic smell with no functional impact.

Current Code:
            language = request.language if hasattr(request, 'language') and request.language else None
            if not language or language == "":
                language = "English"

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

`backend/python/faster-qwen3-tts/install.sh` (lines 6-11)

The backend_dir variable and its uses in the if/else branches are unquoted: if [ -d $backend_dir/common ] and source $backend_dir/common/libbackend.sh. If the path contains spaces (common in some CI/developer setups), word splitting will break the directory check and the source command. All variable expansions should be double-quoted.

Code Suggestion or Comments

backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/install.sh
Lines: 6-11
Issue Type: robustness-low
Severity: low

Issue Description:
The `backend_dir` variable and its uses in the `if/else` branches are unquoted: `if [ -d $backend_dir/common ]` and `source $backend_dir/common/libbackend.sh`. If the path contains spaces (common in some CI/developer setups), word splitting will break the directory check and the source command. All variable expansions should be double-quoted.

Current Code:
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

`backend/python/faster-qwen3-tts/run.sh` (lines 2-7)

Same unquoted-variable issue as install.sh: dirname $0, $backend_dir/common, and $backend_dir/../common/libbackend.sh are all unquoted. Paths with spaces will break. Also run.sh is missing set -e (present in install.sh and test.sh) so errors in sourced scripts are silently ignored.

Code Suggestion or Comments

#!/bin/bash
set -e
backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/run.sh
Lines: 2-7
Issue Type: robustness-low
Severity: low

Issue Description:
Same unquoted-variable issue as `install.sh`: `dirname $0`, `$backend_dir/common`, and `$backend_dir/../common/libbackend.sh` are all unquoted. Paths with spaces will break. Also `run.sh` is missing `set -e` (present in `install.sh` and `test.sh`) so errors in sourced scripts are silently ignored.

Current Code:
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---

codity-chait · 2026-06-03T23:23:45Z

Security Scan Summary

Metric	Value
Vulnerabilities	Critical: 0
Overall Risk	Clean
Files Scanned	14

No critical security issues detected

Scan completed in 15.1s

codity-chait · 2026-06-03T23:25:52Z

License Compliance Scan

Metric	Value
Packages Scanned	337
High Risk (Strong Copyleft)	0
Medium Risk (Weak Copyleft)	4
Low Risk (Permissive)	271
Unknown License	62

Weak copyleft licenses found - verify compatibility

Some packages have unknown licenses - manual review required

Medium Risk Licenses - 4 packages

MPL-2.0 (4 packages):

github.com/libp2p/go-yamux/v5 5.0.1
github.com/hashicorp/golang-lru/v2 2.0.7
github.com/hashicorp/golang-lru 1.0.2
github.com/shoenig/go-m1cpu 0.1.6

Unknown Licenses - 62 packages

github.com/go-openapi/jsonreference 0.21.0
github.com/go-openapi/swag 0.23.0
github.com/go-openapi/spec 0.21.0
github.com/gogo/protobuf 1.3.2
github.com/google/go-cmp 0.7.0
github.com/golang/groupcache 0.0.0-20210331224755-41bb18bfe9da
github.com/google/btree 1.1.3
github.com/golang/snappy 0.0.4
github.com/google/gopacket 1.1.19
github.com/google/pprof 0.0.0-20260115054156-294ebfa9ad83
github.com/gorilla/css 1.0.1
github.com/gorilla/websocket 1.5.3
github.com/huin/goupnp 1.3.0
github.com/jaypipes/pcidb 1.1.1
github.com/klauspost/compress 1.18.1
github.com/libp2p/go-netroute 0.2.2
github.com/libp2p/go-reuseport 0.4.0
github.com/lufia/plan9stats 0.0.0-20250317134145-8bc96cf8fc35
github.com/marten-seemann/tcp 0.0.0-20210406111302-dfbc87cc63fd
github.com/moby/sys/sequential 0.6.0

...and 42 more

codity-chait · 2026-06-03T23:26:10Z

Code Quality Report — test-org-codity/LocalAI · PR #9

Scanned: 2026-06-03 23:26 UTC | Score: 22/100 | Provider: github

Executive Summary

Severity	Count
Critical	0
High	1
Medium	4
Low	113

Top Findings

[CQ-LLM-001] `.github/workflows/backend.yml:211` (Duplication · HIGH)

Issue: Repeated configuration blocks for 'cublas' and 'l4t' build types with similar parameters.
Suggestion: Consolidate similar build configurations into a single template or parameterized structure to adhere to DRY principles.

- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "8"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "faster-qwen3-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'

[CQ-LLM-002] `backend/index.yaml:529` (Maintainability · MEDIUM)

Issue: Use of magic strings for backend names and URLs without explanation.
Suggestion: Define constants or use a configuration file to manage these values, improving readability and maintainability.

name: "faster-qwen3-tts"

[CQ-LLM-003] `backend/index.yaml:529` (Documentation · MEDIUM)

Issue: Missing documentation for the new backend configuration for 'faster-qwen3-tts'.
Suggestion: Add comments or documentation to explain the purpose and usage of the new backend configuration.

description: |
  Real-time Qwen3-TTS inference using CUDA graph capture. Voice clone only; requires NVIDIA GPU with CUDA.

[CQ-012] `backend/python/faster-qwen3-tts/backend.py:184` (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(_ONE_DAY_IN_SECONDS)
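
A sleep call only blocks an event loop when the surrounding code is asynchronous, and the report does not show that code. Assuming the call is a keep-alive loop on the main thread of a thread-based server, one possible alternative is to wait on a `threading.Event` that a signal handler sets. The sketch below is illustrative only; `run_until_stopped`, `start`, and `stop` are hypothetical names, not part of the PR.

```python
import threading


def run_until_stopped(start, stop, stop_event):
    """Start a service, block until stop_event is set, then stop it."""
    start()
    try:
        # Event.wait() parks the thread with no wake-up interval, so a
        # constant such as _ONE_DAY_IN_SECONDS is not needed.
        stop_event.wait()
    finally:
        stop()


calls = []
stop_event = threading.Event()
# Stand-in for a SIGTERM/SIGINT handler: set the event shortly after start-up.
threading.Timer(0.05, stop_event.set).start()
run_until_stopped(
    lambda: calls.append("start"),
    lambda: calls.append("stop"),
    stop_event,
)
```

In a real process the event would typically be set from a handler registered with `signal.signal`, which lets the shutdown path run instead of the process being killed mid-sleep.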

[CQ-012] `backend/python/faster-qwen3-tts/test.py:24` (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(15)
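
In a test, a fixed sleep is usually less of an event-loop concern than a speed and flakiness concern: the test always pays the full delay, and still fails if start-up takes longer. Assuming the sleep is there to wait for a server to start listening (the report does not show the surrounding code), a polling helper is one possible replacement. `wait_for_port` is a hypothetical helper, not part of the PR.

```python
import socket
import time


def wait_for_port(host, port, timeout=15.0, interval=0.1):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused); retry after a short pause.
            time.sleep(interval)
    return False
```

A test could then call `wait_for_port("localhost", port)` and fail early with a clear message when it returns `False`. The port number depends on how the test launches the server, which the report does not state.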

[CQ-LLM-004] `.github/workflows/backend.yml:211` (Style · LOW)

Issue: Inconsistent formatting in YAML configuration.
Suggestion: Ensure consistent indentation and formatting across all YAML files for better readability.

tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'

[CQ-008] `.github/workflows/backend.yml:214` (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

cuda-major-version: "12"

[CQ-008] `.github/workflows/backend.yml:218` (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'

[CQ-008] `.github/workflows/backend.yml:220` (Maintainability · LOW)

Issue: Magic number 24 in code
Suggestion: Extract to a named constant

base-image: "ubuntu:24.04"

[CQ-008] `.github/workflows/backend.yml:225` (Maintainability · LOW)

Issue: Magic number 2404 in code
Suggestion: Extract to a named constant

ubuntu-version: '2404'

Per-File Breakdown

File	High	Medium	Low	Total
`.github/workflows/backend.yml`	1	0	17	18
`Makefile`	0	0	2	2
`backend/index.yaml`	0	2	20	22
`backend/python/faster-qwen3-tts/backend.py`	0	1	24	25
`backend/python/faster-qwen3-tts/requirements.txt`	0	0	1	1
`backend/python/faster-qwen3-tts/test.py`	0	1	49	50

Recommendations

Resolve the High severity duplication finding in `.github/workflows/backend.yml` first, then the Medium severity performance, maintainability, and documentation findings.

Run automated tests after applying fixes to verify no regressions.

mudler added 2 commits February 26, 2026 23:00

feat(backends): add faster-qwen3-tts

51c26f1

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

fix: this backend is CUDA only

e169492

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

codity-ai Bot reviewed May 21, 2026

codity-chait Bot reviewed Jun 3, 2026

chay2199 commented Jun 3, 2026
