
Faster qwen3 tts#9

Open
DhirenMhatre wants to merge 2 commits into master from faster-qwen3-tts

Conversation

@DhirenMhatre

Description

This PR fixes #

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@DhirenMhatre
Author

@codity review

@codity-ai

codity-ai Bot commented May 21, 2026

PR Summary

What Changed

  • Adds a new faster-qwen3-tts gRPC backend for real-time Qwen3-TTS voice cloning with CUDA graph capture.
  • Expands CI/CD to build CUDA 12/13 variants for AMD64 and NVIDIA L4T (ARM64) platforms.
  • Registers 8 backend variants in the index with development and latest tags.

Key Changes by Area

Backend Implementation: New gRPC server at backend/python/faster-qwen3-tts/ with LoadModel and TTS methods, voice cloning via reference audio/text.

Build System: Added to .NOTPARALLEL, prepare-test-extra, test-extra, and docker-build-backends Makefile targets.

CI/CD: New CUDA 12/13 builds for linux/amd64 and linux/arm64 (L4T). Also expanded voxcpm CPU builds to ARM64.

Files Changed

| File | Changes Summary |
| --- | --- |
| .github/workflows/backend.yml | Added CUDA 12/13 builds for AMD64 and L4T ARM64; expanded voxcpm platforms |
| Makefile | Added faster-qwen3-tts to parallelization guards and test/build targets |
| backend/index.yaml | Registered faster-qwen3-tts with 8 variant entries |
| backend/python/faster-qwen3-tts/Makefile | Build automation for the new backend |
| backend/python/faster-qwen3-tts/backend.py | gRPC server implementation with LoadModel and TTS methods |
| backend/python/faster-qwen3-tts/install.sh | Installation script |
| backend/python/faster-qwen3-tts/requirements-cublas12.txt | CUDA 12 dependencies |
| backend/python/faster-qwen3-tts/requirements-cublas13.txt | CUDA 13 dependencies |
| backend/python/faster-qwen3-tts/requirements-l4t12.txt | Jetson L4T 12 with CUDA 12.9 PyTorch wheels |
| backend/python/faster-qwen3-tts/requirements-l4t13.txt | Jetson L4T 13 with CUDA 13.0 PyTorch wheels |
| backend/python/faster-qwen3-tts/requirements.txt | Base dependencies (grpcio, protobuf, soundfile, sox) |
| backend/python/faster-qwen3-tts/run.sh | Standard backend wrapper script |
| backend/python/faster-qwen3-tts/test.py | Unit tests with CUDA gating and end-to-end TTS test |
| backend/python/faster-qwen3-tts/test.sh | Test runner with unquoted path expansions |

Review Focus Areas

  • Path quoting in backend/python/faster-qwen3-tts/test.sh:4-8 (unquoted expansions break on spaces/globs).
  • CUDA graph capture error handling in backend.py for edge cases on different GPU architectures.
  • L4T wheel compatibility: verify PyTorch CUDA 12.9/13.0 wheels work across Jetson variants.

Architecture

Design Decisions: Voice-clone only (no base voice synthesis) keeps scope narrow. CUDA graph capture trades flexibility for latency. Separate requirements files per CUDA version and platform (L4T vs desktop) manage dependency complexity explicitly.

Scalability & Extensibility: Backend follows existing gRPC pattern. 8 variants in index.yaml allow version pinning but increase maintenance surface. Out of scope: CPU fallback or quantization.

Risks: Intentional: NVIDIA-only, no CPU support. Unintentional: L4T wheel URLs may rot; test coverage gaps for actual hardware (tests gate on CUDA availability).

Merge Status

NOT MERGEABLE — PR Score 13/100, below threshold (50)

  • [H4] PR quality score (13) is below merge floor (50)
  • [H5] 7 HIGH-severity inline review findings need resolution (threshold: 3)
  • [H6] Code quality raw score (22) is below merge floor (40)

@codity-ai

codity-ai Bot commented May 21, 2026

Workflow Diagrams

Automatically generated sequence diagrams showing the workflows in this PR

1. Analysis

This PR introduces a new TTS (Text-to-Speech...

Complexity: Complex • Components: faster-qwen3-tts gRPC backend service, GitHub Actions CI/CD workflow, Docker build matrix

sequenceDiagram
    title: faster-qwen3-tts Backend Integration Workflow

    participant Dev as Developer
    participant GH as GitHub Actions CI
    participant MK as Makefile
    participant DK as Docker Build
    participant BI as Backend Index
    participant LA as LocalAI Runtime

    Note over Dev,LA: PR introduces faster-qwen3-tts TTS backend with multi-platform CUDA support

    Dev->>MK: Add BACKEND_FASTER_QWEN3_TTS metadata<br/>Makefile:2/3 chunk
    MK-->>Dev: Register python backend with root context

    Dev->>GH: Push changes triggering backend.yml workflow

    par Parallel Build Matrix
        GH->>DK: Build CUDA 12 amd64<br/>backend.yml:210+
        Note right of DK: cublas, cuda 12.8, ubuntu 24.04
        DK-->>GH: Image: local-ai-backends:latest-gpu-nvidia-cuda-12-faster-qwen3-tts

        GH->>DK: Build CUDA 13 amd64<br/>backend.yml:588+
        Note right of DK: cublas, cuda 13.0, ubuntu 24.04
        DK-->>GH: Image: local-ai-backends:latest-gpu-nvidia-cuda-13-faster-qwen3-tts

        GH->>DK: Build L4T CUDA 12 arm64<br/>backend.yml:1345+
        Note right of DK: NVIDIA Jetson, arm64, ubuntu 22.04
        DK-->>GH: Image: local-ai-backends:latest-nvidia-l4t-faster-qwen3-tts

        GH->>DK: Build L4T CUDA 13 arm64<br/>backend.yml:731+
        Note right of DK: NVIDIA Jetson, arm64, ubuntu 24.04
        DK-->>GH: Image: local-ai-backends:latest-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts
    end

    GH->>BI: Register backend variants<br/>backend/index.yaml
    Note right of BI: YAML anchors define image URIs<br/>for runtime discovery

    LA->>BI: Query available backends at startup
    BI-->>LA: Return faster-qwen3-tts image URIs<br/>for detected platform

    alt CUDA GPU available
        LA->>DK: Pull and run faster-qwen3-tts container
        DK-->>LA: gRPC service ready on configured port
        LA->>DK: TTS request via gRPC
        DK-->>LA: Audio output stream
    else CPU only or CUDA unavailable
        LA->>LA: Skip faster-qwen3-tts registration
        Note right of LA: Backend requires NVIDIA GPU<br/>Graceful degradation to CPU TTS
    end

Note: Diagrams show detected patterns only. Complex workflows may require manual review.

Comment thread backend/index.yaml
Comment on lines +544 to +547
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts"
    default: "cuda12-faster-qwen3-tts"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts"

Functional High

Selecting the new backend with capability default on a CPU-only host now resolves to cuda12-faster-qwen3-tts, so a request like backend: faster-qwen3-tts without NVIDIA hardware will always pull a CUDA image and fail at runtime because backend.py rejects torch.cuda.is_available()==False.

Remove the default capability for this backend or point it to no image/CPU-safe fallback so generic backend selection does not choose a CUDA-only container on non-NVIDIA machines.
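
To make the failure mode concrete, here is a minimal sketch of capability-based resolution. The `resolve_backend` helper, the `"cpu"` capability label, and the fallback order are assumptions for illustration, not LocalAI's actual selection code; only the capability map is taken from the snippet under review.

```python
from typing import Optional

# Capability map as registered in the snippet under review (backend/index.yaml).
CAPABILITIES_WITH_DEFAULT = {
    "nvidia": "cuda12-faster-qwen3-tts",
    "default": "cuda12-faster-qwen3-tts",
    "nvidia-cuda-13": "cuda13-faster-qwen3-tts",
}

# The same map with the `default` entry removed, as the review suggests.
CAPABILITIES_WITHOUT_DEFAULT = {
    key: value
    for key, value in CAPABILITIES_WITH_DEFAULT.items()
    if key != "default"
}


def resolve_backend(capabilities: dict, detected: str) -> Optional[str]:
    """Hypothetical resolver: exact capability match, else the `default` entry."""
    if detected in capabilities:
        return capabilities[detected]
    return capabilities.get("default")


# "cpu" is an assumed label for a host without NVIDIA hardware.
print(resolve_backend(CAPABILITIES_WITH_DEFAULT, "cpu"))     # a CUDA image is chosen
print(resolve_backend(CAPABILITIES_WITHOUT_DEFAULT, "cpu"))  # nothing is chosen
```

Under these assumptions, dropping `default` turns a late runtime failure inside the container into an early "no matching backend" result that the caller can report.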

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert yaml developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/index.yaml
Lines: 544-547
Issue Type: functional-high
Severity: high

Issue Description:
Selecting the new backend with capability `default` on a CPU-only host now resolves to `cuda12-faster-qwen3-tts`, so a request like `backend: faster-qwen3-tts` without NVIDIA hardware will always pull a CUDA image and fail at runtime because `backend.py` rejects `torch.cuda.is_available()==False`.

Current Code:
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts"
    default: "cuda12-faster-qwen3-tts"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts"

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow yaml best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +99 to +149
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )
            text = request.text.strip()
            if not text:
                return backend_pb2.Result(
                    success=False,
                    message="Text is empty"
                )

            language = request.language if hasattr(request, 'language') and request.language else None
            if not language or language == "":
                language = "English"

            ref_audio = self._get_ref_audio_path(request)
            if not ref_audio:
                return backend_pb2.Result(
                    success=False,
                    message="AudioPath is required for voice clone (set in LoadModel)"
                )
            ref_text = self.options.get("ref_text")
            if not ref_text and hasattr(request, 'ref_text') and request.ref_text:
                ref_text = request.ref_text
            if not ref_text:
                return backend_pb2.Result(
                    success=False,
                    message="ref_text is required for voice clone (set via LoadModel Options, e.g. ref_text:Your reference transcript)"
                )

            chunk_size = self.options.get("chunk_size")
            generation_kwargs = {}
            if chunk_size is not None:
                generation_kwargs["chunk_size"] = int(chunk_size)

            audio_list, sr = self.model.generate_voice_clone(
                text=text,
                language=language,
                ref_audio=ref_audio,
                ref_text=ref_text,
                **generation_kwargs
            )

            if audio_list is None or (isinstance(audio_list, list) and len(audio_list) == 0):
                return backend_pb2.Result(
                    success=False,
                    message="No audio output generated"
                )
            audio_data = audio_list[0] if isinstance(audio_list, list) else audio_list
            sf.write(request.dst, audio_data, sr)

Security High

The server writes to a client-controlled filesystem path without validation, so restrict dst to an allowed output directory and reject absolute or traversing paths.

Suggested fix
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )
            # Resolve the base once so the containment check compares absolute paths.
            output_base = os.path.abspath(os.environ.get("LOCALAI_OUTPUT_DIR", "/tmp/localai"))
            requested_path = os.path.normpath(request.dst)
            if os.path.isabs(requested_path):
                return backend_pb2.Result(success=False, message="absolute dst paths are not allowed")
            final_path = os.path.normpath(os.path.join(output_base, requested_path))
            if not final_path.startswith(output_base + os.sep):
                return backend_pb2.Result(success=False, message="invalid dst path")
...
            sf.write(final_path, audio_data, sr)
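
The containment check can also be factored into a small helper that is testable without a gRPC server. The sketch below is one possible variant, not the backend's code: `resolve_output_path` is a hypothetical name, and it swaps in `os.path.realpath` (to resolve symlinks) and `os.path.commonpath` (to compare whole path components) for the `normpath`/`startswith` pair used in the suggested fix.

```python
import os


def resolve_output_path(output_base: str, dst: str) -> str:
    """Return an absolute path for dst inside output_base, or raise ValueError."""
    if not dst:
        raise ValueError("dst (output path) is required")
    if os.path.isabs(dst):
        raise ValueError("absolute dst paths are not allowed")
    base = os.path.realpath(output_base)
    candidate = os.path.realpath(os.path.join(base, dst))
    # commonpath compares path components, so "/tmp/out-evil" is never
    # mistaken for a child of "/tmp/out". The base directory itself is
    # rejected too, because the caller needs a file path.
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError("invalid dst path")
    return candidate


for requested in ("speech/out.wav", "../escape.wav", "/etc/passwd"):
    try:
        print(requested, "->", resolve_output_path("/tmp", requested))
    except ValueError as err:
        print(requested, "-> rejected:", err)
```

A caller would map `ValueError` to `backend_pb2.Result(success=False, message=str(err))`, which keeps the validation logic separate from the response type.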
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 99-149
Issue Type: security-high
Severity: high

Issue Description:
The server writes to a client-controlled filesystem path without validation, so restrict dst to an allowed output directory and reject absolute or traversing paths.

Current Code:
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )
...
            sf.write(request.dst, audio_data, sr)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +82 to +95
        if not self.audio_path:
            return None
        if os.path.isabs(self.audio_path):
            return self.audio_path
        if self.model_file:
            model_file_base = os.path.dirname(self.model_file)
            ref_path = os.path.join(model_file_base, self.audio_path)
            if os.path.exists(ref_path):
                return ref_path
        if self.model_path:
            ref_path = os.path.join(self.model_path, self.audio_path)
            if os.path.exists(ref_path):
                return ref_path
        return self.audio_path

Security High

AudioPath is accepted as an absolute path and otherwise returned unchecked, so validate it against an allowlisted base directory before opening local files.

Suggested fix
    def _get_ref_audio_path(self, request):
        if not self.audio_path:
            return None
        allowed_bases = []
        if self.model_file:
            allowed_bases.append(os.path.abspath(os.path.dirname(self.model_file)))
        if self.model_path:
            allowed_bases.append(os.path.abspath(self.model_path))
        for base in allowed_bases:
            candidate = os.path.abspath(os.path.join(base, self.audio_path))
            if candidate.startswith(base + os.sep) and os.path.exists(candidate):
                return candidate
        return None
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 82-95
Issue Type: security-high
Severity: high

Issue Description:
AudioPath is accepted as an absolute path and otherwise returned unchecked, so validate it against an allowlisted base directory before opening local files.

Current Code:
    def _get_ref_audio_path(self, request):
        if not self.audio_path:
            return None
        if os.path.isabs(self.audio_path):
            return self.audio_path
        if self.model_file:
            model_file_base = os.path.dirname(self.model_file)
            ref_path = os.path.join(model_file_base, self.audio_path)
            if os.path.exists(ref_path):
                return ref_path
        if self.model_path:
            ref_path = os.path.join(self.model_path, self.audio_path)
            if os.path.exists(ref_path):
                return ref_path
        return self.audio_path

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +64 to +72
        model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
        self.audio_path = request.AudioPath if hasattr(request, 'AudioPath') and request.AudioPath else None
        self.model_file = request.ModelFile if hasattr(request, 'ModelFile') and request.ModelFile else None
        self.model_path = request.ModelPath if hasattr(request, 'ModelPath') and request.ModelPath else None

        from faster_qwen3_tts import FasterQwen3TTS
        print(f"Loading model from: {model_path}", file=sys.stderr)
        try:
            self.model = FasterQwen3TTS.from_pretrained(model_path)

Security High

Loading an arbitrary model identifier from the request lets remote callers trigger untrusted model downloads or local path access, so restrict model_path to an allowlist or server-side configuration.

Suggested fix
        allowed_models = {
            "Qwen/Qwen3-TTS-12Hz-0.6B-Base",
        }
        model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
        if model_path not in allowed_models:
            return backend_pb2.Result(success=False, message="unsupported model")
...
            self.model = FasterQwen3TTS.from_pretrained(model_path)
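
The review offers "server-side configuration" as an alternative to a hard-coded set. A minimal sketch of that option follows; the environment variable name `FASTER_QWEN3_TTS_ALLOWED_MODELS` and both helper functions are hypothetical and do not exist in the PR.

```python
import os

DEFAULT_MODEL = "Qwen/Qwen3-TTS-12Hz-0.6B-Base"


def allowed_models(env=None) -> set:
    """Build the allowlist from a comma-separated variable (hypothetical name)."""
    env = os.environ if env is None else env
    raw = env.get("FASTER_QWEN3_TTS_ALLOWED_MODELS", "")
    extra = {item.strip() for item in raw.split(",") if item.strip()}
    # The default model is always permitted; operators can only add to it.
    return {DEFAULT_MODEL} | extra


def select_model(requested: str, env=None) -> str:
    """Return the model identifier to load, or raise ValueError if not allowlisted."""
    model = requested or DEFAULT_MODEL
    if model not in allowed_models(env):
        raise ValueError("unsupported model")
    return model


print(select_model(""))  # falls back to the default model
```

Passing `env` explicitly keeps the helper easy to test; in the backend it would be called once in `LoadModel` before `from_pretrained`.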
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 64-72
Issue Type: security-high
Severity: high

Issue Description:
Loading an arbitrary model identifier from the request lets remote callers trigger untrusted model downloads or local path access, so restrict model_path to an allowlist or server-side configuration.

Current Code:
        model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
...
            self.model = FasterQwen3TTS.from_pretrained(model_path)

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts

Security High

Using an unpinned package name for faster-qwen3-tts allows a supply chain swap to a newer or compromised release, so pin it to an exact version.

Suggested change
- faster-qwen3-tts
+ faster-qwen3-tts==<exact-version>
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert text developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/requirements-cublas13.txt
Lines: 4-4
Issue Type: security-high
Severity: high

Issue Description:
Using an unpinned package name for `faster-qwen3-tts` allows a supply chain swap to a newer or compromised release, so pin it to an exact version.

Current Code:
faster-qwen3-tts

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow text best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +1 to +4
--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts

Security High

Using an untrusted extra package index allows dependency confusion; pin the package to the official source or use --index-url if all packages must come from the PyTorch index.

Also reported at: backend/python/faster-qwen3-tts/requirements-l4t12.txt L1–L4

Suggested fix
--index-url https://download.pytorch.org/whl/cu130
torch==<pinned-version>
torchaudio==<pinned-version>
--extra-index-url https://pypi.org/simple
faster-qwen3-tts==<pinned-version>
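
Both requirements findings (unpinned packages and an extra index) can be checked mechanically before review. The following is a rough sketch rather than a full requirements parser: `audit_requirements` is a hypothetical helper, and it only recognises `==` pins and the `--extra-index-url` option.

```python
def audit_requirements(text: str) -> list:
    """Return human-readable findings for a requirements.txt-style string."""
    findings = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no requirement
        if line.startswith("--extra-index-url"):
            findings.append(f"line {lineno}: extra package index configured")
        elif line.startswith("-"):
            continue  # other pip options are ignored in this sketch
        elif "==" not in line:
            findings.append(f"line {lineno}: '{line}' is not pinned to an exact version")
    return findings


# The four lines quoted in the review comment.
current = """--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts
"""

for finding in audit_requirements(current):
    print(finding)
```

Run against the four quoted lines, the sketch reports one finding per line: the extra index plus three unpinned packages.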
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert text developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/requirements-l4t13.txt
Lines: 1-4
Issue Type: security-high
Severity: high

Issue Description:
Using an untrusted extra package index allows dependency confusion; pin the package to the official source or use --index-url if all packages must come from the PyTorch index.

_Also reported at: `backend/python/faster-qwen3-tts/requirements-l4t12.txt` L1–L4_

Current Code:
--extra-index-url https://download.pytorch.org/whl/cu130
torch
torchaudio
faster-qwen3-tts

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow text best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



    source $backend_dir/../common/libbackend.sh
fi

startBackend $@

Functional High

Using $@ unquoted changes argument boundaries, so call startBackend with "$@".

Suggested change
- startBackend $@
+ startBackend "$@"
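
The effect on argument boundaries is easy to observe by counting what a shell function receives. This sketch assumes a POSIX `sh` on `PATH`; `count_args` and the sample arguments are made up for the demonstration and are not part of `run.sh`.

```python
import subprocess


def count_args(expansion: str, *args: str) -> int:
    """Run a POSIX shell function that reports how many arguments it received."""
    # The function body prints "$#"; `expansion` is either $@ or "$@".
    script = f'count() {{ echo "$#"; }}; count {expansion}'
    result = subprocess.run(
        ["sh", "-c", script, "sh", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Two arguments are passed in; the second contains a space.
print(count_args("$@", "--addr", "my model.bin"))    # 3: the space splits the argument
print(count_args('"$@"', "--addr", "my model.bin"))  # 2: boundaries are preserved
```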
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/run.sh
Lines: 9-9
Issue Type: functional-high
Severity: high

Issue Description:
Using `$@` unquoted changes argument boundaries, so call `startBackend` with `"$@"`.

Current Code:
startBackend $@

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +2 to +6
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh

Security Medium

Unquoted path expansions allow word splitting and option interpretation, so quote $0 and all $backend_dir uses.

Also reported at: backend/python/faster-qwen3-tts/install.sh L6–L10

Suggested fix
backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/run.sh
Lines: 2-6
Issue Type: security-medium
Severity: medium

Issue Description:
Unquoted path expansions allow word splitting and option interpretation, so quote `$0` and all `$backend_dir` uses.

_Also reported at: `backend/python/faster-qwen3-tts/install.sh` L6–L10_

Current Code:
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +52 to +54
    @unittest.skipUnless(
        __import__("torch").cuda.is_available(),
        "faster-qwen3-tts TTS requires CUDA",

Functional Medium

The skip condition imports torch at module load time and will crash the test file on environments without torch installed, so guard the import and default to skipping.

Suggested fix
    try:
        import torch
        _HAS_CUDA = torch.cuda.is_available()
    except Exception:
        _HAS_CUDA = False

    @unittest.skipUnless(
        _HAS_CUDA,
        "faster-qwen3-tts TTS requires CUDA",
    )
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/test.py
Lines: 52-54
Issue Type: functional-medium
Severity: medium

Issue Description:
The skip condition imports torch at module load time and will crash the test file on environments without torch installed, so guard the import and default to skipping.

Current Code:
    @unittest.skipUnless(
        __import__("torch").cuda.is_available(),
        "faster-qwen3-tts TTS requires CUDA",
    )

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



@codity-ai

codity-ai Bot commented May 21, 2026

Nitpicks (Low Priority)

Found 1 low-priority suggestion for code improvement


backend/python/faster-qwen3-tts/test.sh (lines 4-8)

Robustness Low

The path expansions are unquoted, so spaces or glob characters in the script path can break the directory test and source command; quote the variable expansions.

Code Suggestion or Comments
backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/test.sh
Lines: 4-8
Issue Type: robustness-low
Severity: low

Issue Description:
The path expansions are unquoted, so spaces or glob characters in the script path can break the directory test and source command; quote the variable expansions.

Current Code:
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow bash best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---




@codity-ai

codity-ai Bot commented May 21, 2026

Security Scan Summary

| Metric | Value |
| --- | --- |
| Vulnerabilities | Critical: 0 |
| Overall Risk | Clean |
| Files Scanned | 14 |

No critical security issues detected

Scan completed in 9.5s

Security scan powered by Codity.ai

@codity-ai

codity-ai Bot commented May 21, 2026

Dependency vulnerability scanning

| Metric | Value |
| --- | --- |
| Vulnerabilities Found | 4 |
| Scanner | pip-audit |
View vulnerability details (4 items)

1. pip 24.0

CVE: GHSA-4xh5-x5gv-qwph
Fixed in: 25.3

When extracting a tar archive pip may not check symbolic links point into the extraction directory if the tarfile module doesn't implement PEP 706. Note that upgrading pip to a "fixed" version for thi


2. pip 24.0

CVE: GHSA-6vgw-5pg2-w6jp
Fixed in: 26.0

When pip is installing and extracting a maliciously crafted wheel archive, files may be extracted outside the installation directory. The path traversal is limited to prefixes of the installation dire


3. pip 24.0

CVE: GHSA-58qw-9mgm-455v
Fixed in: 26.1

pip handles concatenated tar and ZIP files as ZIP files regardless of filename or whether a file is both a tar and ZIP file. This behavior could result in confusing installation behavior, such as inst


4. pip 24.0

CVE: GHSA-jp4c-xjxw-mgf9
Fixed in: 26.1

pip prior to version 26.1 would run self-update check functionality after installing wheel files which required importing well-known Python modules names. These module imports were intentionally defer
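
Taking the "Fixed in" values above at face value, the lowest pip release that clears all four advisories can be worked out with a simple version comparison. The helper names below are hypothetical, and the parser only handles plain dotted numeric versions like the ones listed.

```python
def parse_version(version: str) -> tuple:
    """Parse a dotted numeric version such as '26.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


# "Fixed in" values for the four advisories listed above.
FIXED_IN = {
    "GHSA-4xh5-x5gv-qwph": "25.3",
    "GHSA-6vgw-5pg2-w6jp": "26.0",
    "GHSA-58qw-9mgm-455v": "26.1",
    "GHSA-jp4c-xjxw-mgf9": "26.1",
}


def open_advisories(installed: str) -> list:
    """Return advisory IDs whose fixed version is newer than the installed one."""
    current = parse_version(installed)
    return sorted(
        advisory
        for advisory, fixed in FIXED_IN.items()
        if current < parse_version(fixed)
    )


minimum_clean = max(FIXED_IN.values(), key=parse_version)
print(minimum_clean)                 # 26.1
print(len(open_advisories("24.0")))  # 4
print(open_advisories("26.0"))       # the two advisories fixed in 26.1
```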

Powered by Codity.ai · Docs

@codity-ai

codity-ai Bot commented May 21, 2026

License Compliance Scan

| Metric | Value |
| --- | --- |
| Packages Scanned | 337 |
| High Risk (Strong Copyleft) | 0 |
| Medium Risk (Weak Copyleft) | 4 |
| Low Risk (Permissive) | 260 |
| Unknown License | 73 |

Weak copyleft licenses found - verify compatibility

Some packages have unknown licenses - manual review required

Medium Risk Licenses - 4 packages

MPL-2.0 (4 packages):

  • github.com/libp2p/go-yamux/v5 5.0.1
  • github.com/hashicorp/golang-lru 1.0.2
  • github.com/hashicorp/golang-lru/v2 2.0.7
  • github.com/shoenig/go-m1cpu 0.1.6
Unknown Licenses - 73 packages
  • github.com/docker/docker-credential-helpers 0.9.3
  • github.com/docker/cli 29.0.3+incompatible
  • github.com/docker/go-connections 0.6.0
  • github.com/docker/go-units 0.5.0
  • github.com/dsnet/compress 0.0.2-0.20210315054119-f66993602bf5
  • github.com/flynn/noise 1.1.0
  • github.com/go-audio/audio 1.0.0
  • github.com/go-logr/logr 1.4.3
  • github.com/go-audio/riff 1.0.0
  • github.com/go-logr/stdr 1.2.2
  • github.com/go-openapi/jsonpointer 0.21.0
  • github.com/go-openapi/swag 0.23.0
  • github.com/go-openapi/jsonreference 0.21.0
  • github.com/gogo/protobuf 1.3.2
  • github.com/go-openapi/spec 0.21.0
  • github.com/golang/groupcache 0.0.0-20210331224755-41bb18bfe9da
  • github.com/golang/snappy 0.0.4
  • github.com/google/btree 1.1.3
  • github.com/google/go-cmp 0.7.0
  • github.com/google/pprof 0.0.0-20260115054156-294ebfa9ad83

...and 53 more

Powered by Codity.ai · Docs

@codity-ai

codity-ai Bot commented May 21, 2026

Code Quality Report — test-org-codity/LocalAI · PR #9

Scanned: 2026-05-21 19:50 UTC | Score: 22/100 | Provider: github

Executive Summary

| Severity | Count |
| --- | --- |
| Critical | 0 |
| High | 1 |
| Medium | 4 |
| Low | 113 |
Top Findings

[CQ-LLM-001] .github/workflows/backend.yml:211 (Duplication · HIGH)

Issue: Repeated configuration blocks for 'cublas' and 'l4t' build types with similar parameters.
Suggestion: Consolidate similar build configurations into a single template to adhere to DRY principles.

- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "8"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "faster-qwen3-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'

[CQ-LLM-002] backend/index.yaml:529 (Maintainability · MEDIUM)

Issue: Magic strings used for backend names and URLs without explanation.
Suggestion: Define constants for backend names and URLs to improve maintainability and readability.

name: "faster-qwen3-tts"

[CQ-LLM-003] backend/index.yaml:529 (Documentation · MEDIUM)

Issue: Missing documentation for the new 'faster-qwen3-tts' configuration.
Suggestion: Add comments or docstrings explaining the purpose and usage of the 'faster-qwen3-tts' configuration.

description: |
  Real-time Qwen3-TTS inference using CUDA graph capture. Voice clone only; requires NVIDIA GPU with CUDA.

[CQ-012] backend/python/faster-qwen3-tts/backend.py:184 (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(_ONE_DAY_IN_SECONDS)

[CQ-012] backend/python/faster-qwen3-tts/test.py:24 (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(15)

[CQ-008] .github/workflows/backend.yml:214 (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

cuda-major-version: "12"

[CQ-008] .github/workflows/backend.yml:218 (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'

[CQ-008] .github/workflows/backend.yml:220 (Maintainability · LOW)

Issue: Magic number 24 in code
Suggestion: Extract to a named constant

base-image: "ubuntu:24.04"

[CQ-008] .github/workflows/backend.yml:225 (Maintainability · LOW)

Issue: Magic number 2404 in code
Suggestion: Extract to a named constant

ubuntu-version: '2404'

[CQ-008] .github/workflows/backend.yml:592 (Maintainability · LOW)

Issue: Magic number 13 in code
Suggestion: Extract to a named constant

cuda-major-version: "13"

Per-File Breakdown

| File | Critical | High | Medium | Low | Total |
|---|---|---|---|---|---|
| .github/workflows/backend.yml | 0 | 1 | 0 | 16 | 17 |
| Makefile | 0 | 0 | 0 | 3 | 3 |
| backend/index.yaml | 0 | 0 | 2 | 20 | 22 |
| backend/python/faster-qwen3-tts/backend.py | 0 | 0 | 1 | 24 | 25 |
| backend/python/faster-qwen3-tts/requirements.txt | 0 | 0 | 0 | 1 | 1 |
| backend/python/faster-qwen3-tts/test.py | 0 | 0 | 1 | 49 | 50 |

Recommendations

  1. Resolve High severity issues, especially error handling gaps and performance bottlenecks.
  2. Run automated tests after applying fixes to verify no regressions.

@chay2199

chay2199 commented Jun 3, 2026

@codity review

@codity-chait

codity-chait Bot commented Jun 3, 2026

PR Summary

What Changed

  • Adds a new faster-qwen3-tts gRPC backend for real-time Qwen3-TTS voice cloning with CUDA graph capture.
  • Integrates CUDA 12/13 builds for x86_64 and Jetson L4T (ARM64) into CI/CD pipeline.
  • Registers 9 backend variants in the index with platform-specific dependencies.

Key Changes by Area

Backend Implementation: New gRPC server with LoadModel and TTS methods, enforces CUDA at runtime, supports voice cloning via reference audio/text.

CI/CD: Added CUDA 12/13 builds for linux/amd64 and NVIDIA L4T builds for linux/arm64. Fixed voxcpm CPU build to include ARM64.

Build System: Added faster-qwen3-tts to parallel-safe targets and docker build pipeline.

Files Changed

File Changes Summary
.github/workflows/backend.yml Added CUDA 12/13 builds for x86_64 and Jetson L4T; fixed voxcpm ARM64 build
Makefile Added faster-qwen3-tts to .NOTPARALLEL, test, and docker build targets
backend/index.yaml Registered backend with 9 variants (CUDA 12/13, L4T, dev tags)
backend/python/faster-qwen3-tts/Makefile Standard backend build configuration
backend/python/faster-qwen3-tts/backend.py gRPC server with CUDA enforcement and voice cloning support
backend/python/faster-qwen3-tts/install.sh Installation script with unquoted path variables (see risks)
backend/python/faster-qwen3-tts/requirements-cublas12.txt PyTorch + faster-qwen3-tts for CUDA 12
backend/python/faster-qwen3-tts/requirements-cublas13.txt PyTorch + faster-qwen3-tts for CUDA 13
backend/python/faster-qwen3-tts/requirements-l4t12.txt Jetson L4T 12.0+ dependencies
backend/python/faster-qwen3-tts/requirements-l4t13.txt Jetson L4T 13.0+ dependencies
backend/python/faster-qwen3-tts/requirements.txt Base dependencies (grpcio, soundfile, sox)
backend/python/faster-qwen3-tts/run.sh Runtime wrapper with unquoted paths, missing set -e
backend/python/faster-qwen3-tts/test.py Unit tests with CUDA-gated TTS test
backend/python/faster-qwen3-tts/test.sh Test runner script

Review Focus Areas

  • Path quoting in shell scripts (install.sh:6-11, run.sh:2-7) - unquoted variables risk breakage with spaces.
  • Missing set -e in run.sh - errors in sourced scripts are silently ignored.
  • Redundant language check in backend.py:110-113 - not language already covers empty strings.

Architecture

Design Decisions: Voice cloning only (no base voice generation). Enforcing CUDA at runtime rather than at build time allows a single image with runtime detection. Platform-specific requirements files isolate the Jetson wheel dependencies.

Risks: Shell scripts leave path variables unquoted; unless that is intentional, they should be quoted. run.sh lacks set -e, so failures in sourced scripts are silently ignored. The backend is GPU-only by design, with no CPU fallback.

Merge Status

NOT MERGEABLE — PR Score 33/100, below threshold (50)

  • [H4] PR quality score (33) is below merge floor (50)
  • [H6] Code quality raw score (22) is below merge floor (40)

@codity-chait

codity-chait Bot commented Jun 3, 2026

Workflow Diagrams

Automatically generated sequence diagrams showing the workflows in this PR

1. Workflow Analysis

Complexity: Complex • Components: faster-qwen3-tts gRPC backend, GitHub Actions CI/CD workflow, Docker build system

sequenceDiagram
    title: Faster Qwen3 TTS Backend Integration Workflow

    participant Dev as Developer
    participant GH as GitHub Actions
    participant Docker as Docker Build
    participant Make as Makefile
    participant Backend as faster_qwen3_tts
    participant LocalAI as LocalAI Core
    participant HF as HuggingFace

    Dev->>Make: Add faster-qwen3-tts to NOTPARALLEL list
    Make->>Make: Register backend metadata<br/>BACKEND_FASTER_QWEN3_TTS

    Dev->>GH: Push to faster-qwen3-tts branch
    GH->>GH: Trigger backend.yml workflow

    par CUDA 12 Build
        GH->>Docker: Build cublas CUDA 12<br/>linux/amd64
        Docker->>Backend: Install requirements-cublas12.txt
    and CUDA 13 Build
        GH->>Docker: Build cublas CUDA 13<br/>linux/amd64
        Docker->>Backend: Install requirements-cublas13.txt
    and L4T CUDA 12 Build
        GH->>Docker: Build l4t CUDA 12<br/>linux/arm64
        Docker->>Backend: Install requirements-l4t12.txt
    and L4T CUDA 13 Build
        GH->>Docker: Build l4t CUDA 13<br/>linux/arm64
        Docker->>Backend: Install requirements-l4t13.txt
    end

    Docker->>Backend: Run install.sh
    Backend->>HF: Download Qwen/Qwen3-TTS-12Hz-0.6B-Base

    Note over Backend: CUDA graph capture enabled<br/>Voice clone only mode

    GH->>Docker: Tag and push images
    Docker->>GH: Publish to quay.io/localai-backends

    Dev->>Make: make docker-build-faster-qwen3-tts
    Make->>Docker: Build local-ai-backend:faster-qwen3-tts
    Docker->>Make: Save as backend-images/faster-qwen3-tts.tar
    Make->>LocalAI: Install via ocifile protocol

    LocalAI->>Backend: Initialize gRPC service
    Backend->>Backend: Check torch.cuda.is_available

    alt CUDA available
        Backend->>Backend: Load model with CUDA graph
    else CUDA not available
        Backend->>Backend: Exit with error
    end

    LocalAI->>Backend: TTS request via backend_pb2
    Backend->>Backend: Generate audio with soundfile
    Backend->>LocalAI: Return audio data

    Note over Backend,LocalAI: Backend registered in<br/>backend/index.yaml for<br/>multiple hardware variants

Note: Diagrams show detected patterns only. Complex workflows may require manual review.

Comment on lines +55 to +65
            key, value = opt.split(":", 1)
            if is_float(value):
                value = float(value)
            elif is_int(value):
                value = int(value)
            elif value.lower() in ["true", "false"]:
                value = value.lower() == "true"
            self.options[key] = value

        model_path = request.Model or "Qwen/Qwen3-TTS-12Hz-0.6B-Base"
        self.audio_path = request.AudioPath if hasattr(request, 'AudioPath') and request.AudioPath else None

Functional High

The option-parsing loop applies is_float before is_int. Because is_float succeeds for any valid integer string (e.g. "1" parses to 1.0), the is_int branch is never reached and integer-valued options are stored as float: chunk_size:512 becomes 512.0. chunk_size itself is later wrapped in int(chunk_size), but any other integer option forwarded to the upstream generate_voice_clone may raise a TypeError at runtime if that function is strict about types. Separately, string options such as ref_text that happen to be purely numeric are also coerced to numbers; reordering the checks does not address that case.

Suggested fix
            if is_int(value):
                value = int(value)
            elif is_float(value):
                value = float(value)
            elif value.lower() in ["true", "false"]:
                value = value.lower() == "true"
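
A minimal sketch of the reordered coercion as a standalone helper, assuming options arrive as key:value strings as in the snippet above. `coerce_option` and `parse_options` are hypothetical names, not functions from backend.py, and skipping entries without a colon is an assumption about the surrounding loop.

```python
def coerce_option(value: str):
    """Coerce an option string to int, float, or bool, else leave it a str."""
    # Try int first: float() also accepts "512", so the order matters.
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        pass
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    return value


def parse_options(raw_options):
    """Parse an iterable of "key:value" strings into a dict."""
    options = {}
    for opt in raw_options:
        if ":" not in opt:
            continue  # assumption: malformed entries are skipped
        key, value = opt.split(":", 1)  # split once so values may contain ":"
        options[key] = coerce_option(value)
    return options
```

A purely numeric ref_text such as "2024" would still be coerced to int here; keeping a set of string-only keys would be one way to avoid that.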
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 55-65
Issue Type: functional-high
Severity: high

Issue Description:
The option-parsing loop applies `is_float` before `is_int`. Since `is_float` succeeds for any valid integer string (e.g. `"1"` parses to `1.0`), integer-valued options such as `chunk_size` will be stored as `float` instead of `int`. Later, `int(chunk_size)` is called defensively, but `ref_text` or other string options that happen to be purely numeric will also be silently mistyped. More critically, this means an integer option like `chunk_size:512` becomes `512.0` (float), and if the upstream `generate_voice_clone` is strict about type, it may raise a `TypeError` at runtime.

Current Code:
            if is_float(value):
                value = float(value)
            elif is_int(value):
                value = int(value)
            elif value.lower() in ["true", "false"]:
                value = value.lower() == "true"

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow Python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---



Comment on lines +100 to +105
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )
            text = request.text.strip()
            if not text:

Robustness High

If TTS is called before LoadModel (or if LoadModel failed after the CUDA check but before setting self.audio_path, self.model_file, self.model_path, or self.options), accessing self.audio_path / self.options etc. in _get_ref_audio_path and the rest of TTS will raise an AttributeError, crashing the gRPC handler and returning no structured error to the caller. The server continues running but that request is silently aborted. A guard check for model readiness is needed.

Suggested fix
    def TTS(self, request, context):
        try:
            if not hasattr(self, 'model') or self.model is None:
                return backend_pb2.Result(
                    success=False,
                    message="Model not loaded. Call LoadModel first."
                )
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )
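
A complementary sketch, not taken from the PR: initializing the attributes in `__init__` removes the AttributeError path entirely, so the guard reduces to a None check. `Result` is a hypothetical stand-in for backend_pb2.Result so the example runs without the generated gRPC stubs, and the attribute names follow the ones mentioned in the comment above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical stand-in for backend_pb2.Result."""
    success: bool
    message: str = ""


class BackendServicer:
    def __init__(self):
        # Defaults exist before LoadModel runs, so TTS never hits AttributeError.
        self.model = None
        self.audio_path = None
        self.options = {}

    def TTS(self, request, context):
        if self.model is None:
            return Result(success=False, message="Model not loaded. Call LoadModel first.")
        if not request.dst:
            return Result(success=False, message="dst (output path) is required")
        # ... synthesis would follow here ...
        return Result(success=True)
```

With defaults in place, a LoadModel call that fails partway leaves `self.model` as None, and the same guard covers that case.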
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 100-105
Issue Type: robustness-high
Severity: high

Issue Description:
If `TTS` is called before `LoadModel` (or if `LoadModel` failed after the CUDA check but before setting `self.audio_path`, `self.model_file`, `self.model_path`, or `self.options`), accessing `self.audio_path` / `self.options` etc. in `_get_ref_audio_path` and the rest of `TTS` will raise an `AttributeError`, crashing the gRPC handler and returning no structured error to the caller. The server continues running but that request is silently aborted. A guard check for model readiness is needed.

Current Code:
    def TTS(self, request, context):
        try:
            if not request.dst:
                return backend_pb2.Result(
                    success=False,
                    message="dst (output path) is required"
                )


            audio_list, sr = self.model.generate_voice_clone(
                text=text,
                language=language,
                ref_audio=ref_audio,

Security Medium

The output file path request.dst is passed directly to sf.write() without validation. A caller can write audio output to any filesystem location the process has access to (e.g. /etc/cron.d/backdoor.wav, overwriting existing files, or writing outside the intended temp directory). The path should be validated to fall within the expected output directory before use.
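
One possible shape for that validation, as a sketch under stated assumptions: `resolve_output_path` and `allowed_dir` are hypothetical, and the source does not say which directory LocalAI expects output to land in, so the allowed root would have to come from configuration.

```python
import os


def resolve_output_path(dst: str, allowed_dir: str) -> str:
    """Return the real path of dst if it lies inside allowed_dir, else raise."""
    base = os.path.realpath(allowed_dir)
    # A relative dst is resolved against base; an absolute dst replaces it.
    target = os.path.realpath(os.path.join(base, dst))
    # commonpath equals base only when target is base itself or below it.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"output path {dst!r} is outside {allowed_dir!r}")
    return target
```

The handler would then call `sf.write(resolve_output_path(request.dst, allowed_dir), audio_data, sr)` and return a failed Result on ValueError. Using realpath on both sides means symlinks and `..` segments are resolved before the comparison.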

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 138-138
Issue Type: security-medium
Severity: medium

Issue Description:
The output file path `request.dst` is passed directly to `sf.write()` without validation. A caller can write audio output to any filesystem location the process has access to (e.g. `/etc/cron.d/backdoor.wav`, overwriting existing files, or writing outside the intended temp directory). The path should be validated to fall within the expected output directory before use.

Current Code:
            sf.write(request.dst, audio_data, sr)


Comment on lines +1 to +8
grpcio==1.71.0
protobuf
certifi
packaging==24.1
soundfile
setuptools
six
sox

Maintainability Medium

All dependency versions in requirements.txt are unpinned except grpcio==1.71.0 and packaging==24.1. This means protobuf, certifi, soundfile, setuptools, six, and sox will resolve to whatever is latest at install time, potentially breaking the backend silently on a future pip install if a breaking release is published. The CUDA-specific requirements files also leave torch, torchaudio, and faster-qwen3-tts fully unpinned. For a production backend image, pinning is strongly recommended.
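
One way to keep this from regressing is a small check that flags requirement lines without an exact pin. The sketch below is a hypothetical helper, not part of the PR; it treats only `==` as a pin and does not handle URL requirements or environment markers.

```python
def unpinned_requirements(text: str):
    """Return requirement lines that lack an exact '==' pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as --extra-index-url
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Contents of requirements.txt as quoted in this review comment.
REQUIREMENTS = """\
grpcio==1.71.0
protobuf
certifi
packaging==24.1
soundfile
setuptools
six
sox
"""

print(unpinned_requirements(REQUIREMENTS))
```

Run against the file above, it reports the six packages named in the comment; the same function could be pointed at the CUDA-specific requirements files.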

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/requirements.txt
Lines: 1-8
Issue Type: maintainability-medium
Severity: medium

Issue Description:
All dependency versions in `requirements.txt` are unpinned except `grpcio==1.71.0` and `packaging==24.1`. This means `protobuf`, `certifi`, `soundfile`, `setuptools`, `six`, and `sox` will resolve to whatever is latest at install time, potentially breaking the backend silently on a future pip install if a breaking release is published. The CUDA-specific requirements files also leave `torch`, `torchaudio`, and `faster-qwen3-tts` fully unpinned. For a production backend image, pinning is strongly recommended.

Current Code:
grpcio==1.71.0
protobuf
certifi
packaging==24.1
soundfile
setuptools
six
sox


Comment on lines +22 to +23
            cwd=os.path.dirname(os.path.abspath(__file__)),
        )

Robustness Low

The test setUp unconditionally sleeps 15 seconds waiting for the backend process to start. If the process fails immediately (e.g. import error, port conflict), the test suite still waits the full 15 seconds and then gets cryptic gRPC connection-refused errors rather than a clear failure. A readiness poll with a short sleep-and-retry loop would both speed up the common case and give a clearer failure message.
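
A sketch of such a readiness poll using only the standard library. `wait_for_port` is a hypothetical helper, and it assumes a successful TCP connect is a good enough readiness signal; a gRPC-level check such as `grpc.channel_ready_future` would be stricter.

```python
import socket
import subprocess
import time


def wait_for_port(process: subprocess.Popen, host: str, port: int,
                  timeout: float = 30.0, interval: float = 0.25) -> None:
    """Block until host:port accepts connections, or fail with a clear error."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Fail fast if the backend died (import error, port conflict, ...).
        if process.poll() is not None:
            raise RuntimeError(
                f"backend exited early with code {process.returncode}")
        try:
            with socket.create_connection((host, port), timeout=interval):
                return  # something is listening; the backend is reachable
        except OSError:
            time.sleep(interval)
    raise TimeoutError(f"backend not reachable on {host}:{port} after {timeout}s")
```

In setUp this would replace the fixed sleep with `wait_for_port(self.service, "localhost", 50052)`, so a healthy backend is detected as soon as it listens and a crashed one surfaces as a RuntimeError instead of a connection-refused error later.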

Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/test.py
Lines: 22-23
Issue Type: robustness-low
Severity: low

Issue Description:
The test `setUp` unconditionally sleeps 15 seconds waiting for the backend process to start. If the process fails immediately (e.g. import error, port conflict), the test suite still waits the full 15 seconds and then gets cryptic gRPC connection-refused errors rather than a clear failure. A readiness poll with a short sleep-and-retry loop would both speed up the common case and give a clearer failure message.

Current Code:
        self.service = subprocess.Popen(
            ["python3", "backend.py", "--addr", "localhost:50052"],


Comment on lines +46 to +57
                    CUDA=True,
                ),
                timeout=10.0,
            )
        self.assertFalse(response.success)

    @unittest.skipUnless(
        __import__("torch").cuda.is_available(),
        "faster-qwen3-tts TTS requires CUDA",
    )
    def test_tts(self):
        import soundfile as sf

Functional Medium

test_load_model_requires_cuda asserts self.assertFalse(response.success) unconditionally. On a CI runner that does have a GPU, LoadModel will succeed and the assertion will fail, breaking the test suite on GPU-enabled hosts. The test intent is to confirm behaviour when CUDA is absent; it should be skipped when CUDA is available.

Suggested fix
    @unittest.skipIf(
        __import__("torch").cuda.is_available(),
        "LoadModel is only expected to fail when CUDA is unavailable",
    )
    def test_load_model_requires_cuda(self):
        with grpc.insecure_channel("localhost:50052") as channel:
            stub = backend_pb2_grpc.BackendStub(channel)
            response = stub.LoadModel(
                backend_pb2.ModelOptions(
                    Model="Qwen/Qwen3-TTS-12Hz-0.6B-Base",
                    CUDA=True,
                ),
                timeout=10.0,
            )
        self.assertFalse(response.success)
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/test.py
Lines: 46-57
Issue Type: functional-medium
Severity: medium

Issue Description:
`test_load_model_requires_cuda` asserts `self.assertFalse(response.success)` unconditionally. On a CI runner that *does* have a GPU, `LoadModel` will succeed and the assertion will fail, breaking the test suite on GPU-enabled hosts. The test intent is to confirm behaviour when CUDA is *absent*; it should be skipped when CUDA is available.

Current Code:
    def test_load_model_requires_cuda(self):
        with grpc.insecure_channel("localhost:50052") as channel:
            stub = backend_pb2_grpc.BackendStub(channel)
            response = stub.LoadModel(
                backend_pb2.ModelOptions(
                    Model="Qwen/Qwen3-TTS-12Hz-0.6B-Base",
                    CUDA=True,
                ),
                timeout=10.0,
            )
        self.assertFalse(response.success)


Comment thread backend/index.yaml
Comment on lines 2301 to +2305
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-qwen-tts"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-qwen-tts
## faster-qwen3-tts
- !!merge <<: *faster-qwen3-tts

Functional Medium

The faster-qwen3-tts-development entry (the first merged block after the anchor) overrides capabilities but does NOT set a uri or mirrors. This means the development variant has no image URI; when LocalAI tries to resolve and pull this backend image for a development/master install, it will find no image reference and fail silently or with a confusing error. The qwen-tts-development entry and others in the file all carry explicit uri + mirrors pointing to master-* tagged images.

Suggested fix
- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts"
  mirrors:
    - localai/localai-backends:master-gpu-nvidia-cuda-12-faster-qwen3-tts
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert YAML developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/index.yaml
Lines: 2301-2305
Issue Type: functional-medium
Severity: medium

Issue Description:
The `faster-qwen3-tts-development` entry (the first merged block after the anchor) overrides `capabilities` but does NOT set a `uri` or `mirrors`. This means the development variant has no image URI; when LocalAI tries to resolve and pull this backend image for a development/master install, it will find no image reference and fail silently or with a confusing error. The `qwen-tts-development` entry and others in the file all carry explicit `uri` + `mirrors` pointing to `master-*` tagged images.

Current Code:
- !!merge <<: *faster-qwen3-tts
  name: "faster-qwen3-tts-development"
  capabilities:
    nvidia: "cuda12-faster-qwen3-tts-development"
    default: "cuda12-faster-qwen3-tts-development"
    nvidia-cuda-13: "cuda13-faster-qwen3-tts-development"
    nvidia-cuda-12: "cuda12-faster-qwen3-tts-development"
    nvidia-l4t: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-12: "nvidia-l4t-faster-qwen3-tts-development"
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-faster-qwen3-tts-development"


@codity-chait

codity-chait Bot commented Jun 3, 2026

Nitpicks (Low Priority)

Found 3 low-priority suggestions for code improvement


backend/python/faster-qwen3-tts/backend.py (lines 110-113)

Functional Low

The language check if not language or language == "" is redundant. If language is already None or an empty string "", the first condition not language is already True for both; the second language == "" is never independently needed. This is a minor logic smell with no functional impact.

Code Suggestion or Comments
            language = request.language if hasattr(request, 'language') and request.language else None
            if not language:
                language = "English"
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert Python developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/backend.py
Lines: 110-113
Issue Type: functional-low
Severity: low

Issue Description:
The `language` check `if not language or language == ""` is redundant. If `language` is already `None` or an empty string `""`, the first condition `not language` is already `True` for both; the second `language == ""` is never independently needed. This is a minor logic smell with no functional impact.

Current Code:
            language = request.language if hasattr(request, 'language') and request.language else None
            if not language or language == "":
                language = "English"



backend/python/faster-qwen3-tts/install.sh (lines 6-11)

Robustness Low

The backend_dir variable and its uses in the if/else branches are unquoted: if [ -d $backend_dir/common ] and source $backend_dir/common/libbackend.sh. If the path contains spaces (common in some CI/developer setups), word splitting will break the directory check and the source command. All variable expansions should be double-quoted.

Code Suggestion or Comments
backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert bash developer with deep knowledge of security, performance, and best practices.

### Context

File: backend/python/faster-qwen3-tts/install.sh
Lines: 6-11
Issue Type: robustness-low
Severity: low

Issue Description:
The `backend_dir` variable and its uses in the `if/else` branches are unquoted: `if [ -d $backend_dir/common ]` and `source $backend_dir/common/libbackend.sh`. If the path contains spaces (common in some CI/developer setups), word splitting will break the directory check and the source command. All variable expansions should be double-quoted.

Current Code:
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi



backend/python/faster-qwen3-tts/run.sh (lines 2-7)

Robustness Low

Same unquoted-variable issue as install.sh: dirname $0, $backend_dir/common, and $backend_dir/../common/libbackend.sh are all unquoted. Paths with spaces will break. Also run.sh is missing set -e (present in install.sh and test.sh) so errors in sourced scripts are silently ignored.

Code Suggestion or Comments
#!/bin/bash
set -e
backend_dir=$(dirname "$0")
if [ -d "$backend_dir/common" ]; then
    source "$backend_dir/common/libbackend.sh"
else
    source "$backend_dir/../common/libbackend.sh"
fi

codity-chait Bot commented Jun 3, 2026

Security Scan Summary

| Metric | Value |
| --- | --- |
| Vulnerabilities | Critical: 0 |
| Overall Risk | Clean |
| Files Scanned | 14 |

No critical security issues detected

Scan completed in 15.1s

Security scan powered by Codity.ai

codity-chait Bot commented Jun 3, 2026

License Compliance Scan

| Metric | Value |
| --- | --- |
| Packages Scanned | 337 |
| High Risk (Strong Copyleft) | 0 |
| Medium Risk (Weak Copyleft) | 4 |
| Low Risk (Permissive) | 271 |
| Unknown License | 62 |

Weak copyleft licenses found - verify compatibility

Some packages have unknown licenses - manual review required

Medium Risk Licenses - 4 packages

MPL-2.0 (4 packages):

  • github.com/libp2p/go-yamux/v5 5.0.1
  • github.com/hashicorp/golang-lru/v2 2.0.7
  • github.com/hashicorp/golang-lru 1.0.2
  • github.com/shoenig/go-m1cpu 0.1.6

Unknown Licenses - 62 packages

  • github.com/go-openapi/jsonreference 0.21.0
  • github.com/go-openapi/swag 0.23.0
  • github.com/go-openapi/spec 0.21.0
  • github.com/gogo/protobuf 1.3.2
  • github.com/google/go-cmp 0.7.0
  • github.com/golang/groupcache 0.0.0-20210331224755-41bb18bfe9da
  • github.com/google/btree 1.1.3
  • github.com/golang/snappy 0.0.4
  • github.com/google/gopacket 1.1.19
  • github.com/google/pprof 0.0.0-20260115054156-294ebfa9ad83
  • github.com/gorilla/css 1.0.1
  • github.com/gorilla/websocket 1.5.3
  • github.com/huin/goupnp 1.3.0
  • github.com/jaypipes/pcidb 1.1.1
  • github.com/klauspost/compress 1.18.1
  • github.com/libp2p/go-netroute 0.2.2
  • github.com/libp2p/go-reuseport 0.4.0
  • github.com/lufia/plan9stats 0.0.0-20250317134145-8bc96cf8fc35
  • github.com/marten-seemann/tcp 0.0.0-20210406111302-dfbc87cc63fd
  • github.com/moby/sys/sequential 0.6.0

...and 42 more

Powered by Codity.ai · Docs

codity-chait Bot commented Jun 3, 2026

Code Quality Report — test-org-codity/LocalAI · PR #9

Scanned: 2026-06-03 23:26 UTC | Score: 22/100 | Provider: github

Executive Summary

| Severity | Count |
| --- | --- |
| Critical | 0 |
| High | 1 |
| Medium | 4 |
| Low | 113 |

Top Findings

[CQ-LLM-001] .github/workflows/backend.yml:211 (Duplication · HIGH)

Issue: Repeated configuration blocks for 'cublas' and 'l4t' build types with similar parameters.
Suggestion: Consolidate similar build configurations into a single template or parameterized structure to adhere to DRY principles.

- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "8"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "faster-qwen3-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'

[CQ-LLM-002] backend/index.yaml:529 (Maintainability · MEDIUM)

Issue: Use of magic strings for backend names and URLs without explanation.
Suggestion: Define constants or use a configuration file to manage these values, improving readability and maintainability.

name: "faster-qwen3-tts"

[CQ-LLM-003] backend/index.yaml:529 (Documentation · MEDIUM)

Issue: Missing documentation for the new backend configuration for 'faster-qwen3-tts'.
Suggestion: Add comments or documentation to explain the purpose and usage of the new backend configuration.

description: |
  Real-time Qwen3-TTS inference using CUDA graph capture. Voice clone only; requires NVIDIA GPU with CUDA.

[CQ-012] backend/python/faster-qwen3-tts/backend.py:184 (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(_ONE_DAY_IN_SECONDS)
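
Whether this finding applies depends on whether `backend.py` actually runs an asyncio event loop, which the snippet alone does not show. If the server is synchronous and the sleep only keeps the main thread alive, nothing else is blocked by it. The following is a minimal sketch of one alternative under that assumption: the main thread waits on a `threading.Event` that signal handlers set, so shutdown is immediate instead of depending on a long sleep. `FakeServer` and `serve_until_stopped` are hypothetical names for illustration; the only thing assumed about the real server is a `stop(grace)` method.

```python
import signal
import threading


class FakeServer:
    """Hypothetical stand-in for the real server; only stop() is modeled."""

    def __init__(self):
        self.stopped = False

    def stop(self, grace):
        self.stopped = True


def serve_until_stopped(server, stop_event, install_signal_handlers=True):
    """Block the calling thread until stop_event is set, then stop the server."""
    if install_signal_handlers:
        # Signal handlers can only be installed from the main thread.
        for sig in (signal.SIGINT, signal.SIGTERM):
            signal.signal(sig, lambda signum, frame: stop_event.set())
    # Event.wait() returns as soon as the event is set, so there is no
    # sleep interval to wait out before shutting down.
    stop_event.wait()
    server.stop(0)
```

In a real entry point this would be called after the server has been started, with a fresh `threading.Event()`.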

[CQ-012] backend/python/faster-qwen3-tts/test.py:24 (Performance · MEDIUM)

Issue: Synchronous I/O call may block the event loop in async context
Suggestion: Use async alternatives (aiofiles, httpx, asyncio.sleep)

time.sleep(15)
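
A fixed 15-second sleep in a test is usually less an event-loop problem than a speed and flakiness problem: it waits the full time even when the service is ready earlier, and fails if startup takes longer. Below is a rough sketch of a polling alternative, assuming the sleep exists to wait for a server process to start listening on a TCP port. `wait_for_port` and its parameters are hypothetical and not part of the PR.

```python
import socket
import time


def wait_for_port(host, port, timeout=15.0, interval=0.2):
    """Return True once a TCP connection succeeds, False if timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(interval)
    return False
```

A test setup could call this in place of the sleep and fail fast when it returns `False`. Note that an open port only shows the process is accepting connections, not that a model has finished loading.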

[CQ-LLM-004] .github/workflows/backend.yml:211 (Style · LOW)

Issue: Inconsistent formatting in YAML configuration.
Suggestion: Ensure consistent indentation and formatting across all YAML files for better readability.

tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'

[CQ-008] .github/workflows/backend.yml:214 (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

cuda-major-version: "12"

[CQ-008] .github/workflows/backend.yml:218 (Maintainability · LOW)

Issue: Magic number 12 in code
Suggestion: Extract to a named constant

tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts'

[CQ-008] .github/workflows/backend.yml:220 (Maintainability · LOW)

Issue: Magic number 24 in code
Suggestion: Extract to a named constant

base-image: "ubuntu:24.04"

[CQ-008] .github/workflows/backend.yml:225 (Maintainability · LOW)

Issue: Magic number 2404 in code
Suggestion: Extract to a named constant

ubuntu-version: '2404'

Per-File Breakdown

| File | Critical | High | Medium | Low | Total |
| --- | --- | --- | --- | --- | --- |
| .github/workflows/backend.yml | 0 | 1 | 0 | 17 | 18 |
| Makefile | 0 | 0 | 0 | 2 | 2 |
| backend/index.yaml | 0 | 0 | 2 | 20 | 22 |
| backend/python/faster-qwen3-tts/backend.py | 0 | 0 | 1 | 24 | 25 |
| backend/python/faster-qwen3-tts/requirements.txt | 0 | 0 | 0 | 1 | 1 |
| backend/python/faster-qwen3-tts/test.py | 0 | 0 | 1 | 49 | 50 |

Recommendations

  1. Resolve the High severity issue first (the duplicated build configuration blocks in .github/workflows/backend.yml), then the Medium findings.
  2. Run automated tests after applying fixes to verify no regressions.
