🛡️ Sentinel: [HIGH] Fix exception information leakage #497
- Prevent internal Python exception strings (`str(error)`) from being propagated over multiprocessing queues.
- Map specific exceptions (FileNotFoundError, ValueError, RuntimeError) to safe, generic strings.
- Add `logger.error(..., exc_info=True)` to internally log full stack traces instead of dropping or exposing them.
- Ensure API boundary does not leak server paths or decoding states to clients.
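The pattern described in these bullets can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `run_worker`, `safe_error_message`, the payload shape, and every message string are hypothetical, since the real strings are not shown in this thread.

```python
import logging

logger = logging.getLogger("sanitized_worker_sketch")

# Hypothetical client-facing messages (assumption: the real strings differ).
SAFE_MESSAGES = {
    FileNotFoundError: "Audio file could not be found.",
    ValueError: "Audio file could not be decoded.",
    RuntimeError: "Stem separation ran out of resources.",
}
FALLBACK_MESSAGE = "Stem separation failed unexpectedly."


def safe_error_message(error: BaseException) -> str:
    """Pick a static message by exception type, never by message content."""
    for exc_type, message in SAFE_MESSAGES.items():
        if isinstance(error, exc_type):
            return message
    return FALLBACK_MESSAGE


def run_worker(job, result_queue) -> None:
    """Run job() and put either its result or a sanitized error on the queue."""
    try:
        result_queue.put({"ok": True, "result": job()})
    except Exception as error:
        # Static log text; exc_info=True attaches the full traceback server-side.
        logger.error("Stem separation worker failed", exc_info=True)
        result_queue.put({"ok": False, "error": safe_error_message(error)})
```

Any object with a `put` method works as `result_queue` here, so the same function can be exercised with `queue.Queue` in tests and a multiprocessing queue in production.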
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Pull request overview
OpenCode reviewed the current-head evidence but cannot approve because required coverage evidence did not pass.
Findings
1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove 100% test and docstring coverage
- Problem: The OpenCode approval path reached an APPROVE control result while the separate coverage-evidence job result was `failure`.
- Root cause: Automated approval is only valid when the same-head coverage-evidence job proves both test coverage and docstring coverage at 100%, or reports not applicable because no supported source files or package manifests exist. Missing, partial, failed, skipped, unavailable, unsupported-tooling, or below-100 evidence is a blocker.
- Fix: Install or configure the repository coverage/docstring coverage tooling when source files or package manifests exist, rerun the current-head coverage-evidence job, and approve only after it reports `success` with 100% coverage or explicit no-source not-applicable evidence.
- Regression test: Keep the approval branch checking `needs.coverage-evidence.result == success` before posting APPROVE.
- Result: REQUEST_CHANGES
- Reason: coverage-evidence result was `failure`, so 100% current-head test/docstring coverage was not proven for `87bed0f0b69be63fe797b1ae1e6e7c6ef7bec52b`.
- Head SHA: `87bed0f0b69be63fe797b1ae1e6e7c6ef7bec52b`
- Workflow run: 28355182139
- Workflow attempt: 1
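The approval rule stated in this finding can be read as a small decision function. The sketch below is only an illustration of that rule in Python; the real gate lives in the workflow YAML, and the function name `review_decision` is hypothetical.

```python
from typing import Optional


def review_decision(model_verdict: str, coverage_result: Optional[str]) -> str:
    """Downgrade an APPROVE verdict unless coverage evidence reported success.

    coverage_result stands in for needs.coverage-evidence.result; None models
    missing evidence. Anything other than "success" is treated as a blocker.
    """
    if model_verdict != "APPROVE":
        return model_verdict
    if coverage_result == "success":
        return "APPROVE"
    return "REQUEST_CHANGES"
```

With this shape, `failure`, `skipped`, `cancelled`, and missing results all fall through to REQUEST_CHANGES without being enumerated.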
Coverage evidence
- Head SHA: `87bed0f0b69be63fe797b1ae1e6e7c6ef7bec52b`
- Coverage policy: current-head test coverage and docstring coverage must prove 100% before automated approval.
- Approval policy: missing, partial, unavailable, failed, or below-100% coverage evidence is blocking.
Python project dependencies (services/analysis-engine)
Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
Creating virtual environment at: services/analysis-engine/.venv
Resolved 49 packages in 0.69ms
Building bandscope-analysis @ file:///home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine
Downloading soundfile (1.3MiB)
Downloading scipy (33.6MiB)
Downloading pygments (1.2MiB)
Downloading ruff (10.7MiB)
Downloading mypy (13.0MiB)
Downloading numba (3.6MiB)
Downloading yt-dlp (3.0MiB)
Downloading numpy (15.8MiB)
Downloading scikit-learn (8.5MiB)
Downloading llvmlite (53.7MiB)
Downloaded soundfile
Downloaded pygments
Built bandscope-analysis @ file:///home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine
Downloaded numba
Downloaded ruff
Downloaded yt-dlp
Downloaded scikit-learn
Downloaded numpy
Downloaded scipy
Downloaded llvmlite
Downloaded mypy
Prepared 44 packages in 2.15s
Installed 44 packages in 74ms
+ audioread==3.1.0
+ bandit==1.9.4
+ bandscope-analysis==0.1.0 (from file:///home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine)
+ certifi==2026.2.25
+ cffi==2.0.0
+ charset-normalizer==3.4.6
+ coverage==7.13.4
+ decorator==5.2.1
+ idna==3.18
+ iniconfig==2.3.0
+ joblib==1.5.3
+ lazy-loader==0.5
+ librosa==0.11.0
+ librt==0.8.1
+ llvmlite==0.45.1
+ markdown-it-py==4.0.0
+ mdurl==0.1.2
+ msgpack==1.2.1
+ mypy==1.19.1
+ mypy-extensions==1.1.0
+ numba==0.62.1
+ numpy==2.3.5
+ packaging==26.0
+ pathspec==1.0.4
+ platformdirs==4.9.4
+ pluggy==1.6.0
+ pooch==1.9.0
+ pycparser==3.0
+ pygments==2.20.0
+ pytest==9.0.3
+ pytest-cov==7.0.0
+ pyyaml==6.0.3
+ requests==2.33.0
+ rich==15.0.0
+ ruff==0.15.5
+ scikit-learn==1.8.0
+ scipy==1.17.1
+ soundfile==0.13.1
+ soxr==1.0.0
+ stevedore==5.7.0
+ threadpoolctl==3.6.0
+ typing-extensions==4.15.0
+ urllib3==2.7.0
+ yt-dlp==2026.6.9
- Result: PASS
Python test coverage (services/analysis-engine)
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine
configfile: pyproject.toml
plugins: cov-7.0.0
collected 441 items
tests/test_activity.py ........ [ 1%]
tests/test_anchors.py .... [ 2%]
tests/test_api.py ......................... [ 8%]
tests/test_chord_recognizer.py .................... [ 12%]
tests/test_chords.py ......................... [ 18%]
tests/test_cli.py ................. [ 22%]
tests/test_extractor.py ...... [ 23%]
tests/test_health.py . [ 24%]
tests/test_pipeline_integration.py ......... [ 26%]
tests/test_pitch_tracker.py ............... [ 29%]
tests/test_priority.py ....... [ 31%]
tests/test_ranges.py ................... [ 35%]
tests/test_release_asset_selection.py ........ [ 37%]
tests/test_release_metadata.py ....... [ 38%]
tests/test_release_packaging.py ......... [ 40%]
tests/test_roles.py ....... [ 42%]
tests/test_roles_ml.py ... [ 43%]
tests/test_segmenter.py ..................... [ 47%]
tests/test_separation.py .................................. [ 55%]
tests/test_supply_chain_policy.py ...................................... [ 64%]
........................................................................ [ 80%]
..................................................... [ 92%]
tests/test_temporal.py ......... [ 94%]
tests/test_transcription.py ... [ 95%]
tests/test_tuning.py ..... [ 96%]
tests/test_youtube.py ................ [100%]
=============================== warnings summary ===============================
tests/test_pipeline_integration.py::test_pipeline_without_detected_sections_falls_back
tests/test_roles.py::test_role_extractor_falls_back_when_activity_detection_fails
/home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine/.venv/lib/python3.12/site-packages/librosa/core/pitch.py:103: UserWarning: Trying to estimate tuning from empty frequency set.
return pitch_tuning(
tests/test_roles.py::test_role_extractor_falls_back_when_activity_detection_fails
/home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine/.venv/lib/python3.12/site-packages/librosa/core/spectrum.py:266: UserWarning: n_fft=2048 is too large for input signal of length=100
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================ tests coverage ================================
_______________ coverage: platform linux, python 3.12.3-final-0 ________________
Name Stmts Miss Cover Missing
------------------------------------------------------------------------------------
src/bandscope_analysis/__init__.py 3 0 100%
src/bandscope_analysis/api.py 578 0 100%
src/bandscope_analysis/chords/__init__.py 5 0 100%
src/bandscope_analysis/chords/analyzer.py 116 0 100%
src/bandscope_analysis/chords/capo.py 10 0 100%
src/bandscope_analysis/chords/chord_recognizer.py 192 0 100%
src/bandscope_analysis/chords/model.py 15 0 100%
src/bandscope_analysis/cli.py 68 0 100%
src/bandscope_analysis/health.py 7 0 100%
src/bandscope_analysis/ranges/__init__.py 4 0 100%
src/bandscope_analysis/ranges/analyzer.py 77 0 100%
src/bandscope_analysis/ranges/model.py 19 0 100%
src/bandscope_analysis/ranges/pitch_tracker.py 54 0 100%
src/bandscope_analysis/roles/__init__.py 4 0 100%
src/bandscope_analysis/roles/activity.py 59 0 100%
src/bandscope_analysis/roles/extractor.py 118 0 100%
src/bandscope_analysis/roles/model.py 58 0 100%
src/bandscope_analysis/roles/priority.py 13 0 100%
src/bandscope_analysis/roles/tuning.py 11 0 100%
src/bandscope_analysis/sections/__init__.py 6 0 100%
src/bandscope_analysis/sections/anchors.py 5 0 100%
src/bandscope_analysis/sections/extractor.py 38 0 100%
src/bandscope_analysis/sections/model.py 35 0 100%
src/bandscope_analysis/sections/segmenter.py 140 0 100%
src/bandscope_analysis/sections/utils.py 8 0 100%
src/bandscope_analysis/separation/__init__.py 4 0 100%
src/bandscope_analysis/separation/audio_separator.py 145 0 100%
src/bandscope_analysis/separation/model.py 31 0 100%
src/bandscope_analysis/separation/separator.py 34 0 100%
src/bandscope_analysis/temporal/__init__.py 3 0 100%
src/bandscope_analysis/temporal/analyzer.py 49 0 100%
src/bandscope_analysis/temporal/model.py 9 0 100%
src/bandscope_analysis/transcription/__init__.py 2 0 100%
src/bandscope_analysis/transcription/api.py 11 0 100%
src/bandscope_analysis/youtube.py 81 0 100%
------------------------------------------------------------------------------------
TOTAL 2012 0 100%
Required test coverage of 100% reached. Total coverage: 100.00%
================== 441 passed, 3 warnings in 90.04s (0:01:30) ==================
- Result: PASS
Python docstring coverage
- Result: DEFERRED
- Reason: package.json defines check:python-docstrings; repository-owned docstring coverage runs after package dependency setup.
JavaScript/TypeScript dependencies (npm ci)
added 272 packages, and audited 275 packages in 7s
71 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
- Result: PASS
Repository docstring coverage
> bandscope@0.1.3 check:python-docstrings
> sh -c 'cd services/analysis-engine && uv run ruff check src tests ../../scripts --select D100,D101,D102,D103,D104,D105,D106,D107'
All checks passed!
- Result: PASS
JavaScript/TypeScript test coverage
> bandscope@0.1.3 test
> npm run test --workspaces --if-present && sh -c 'cd services/analysis-engine && uv run pytest tests --cov=src/bandscope_analysis --cov-report=term-missing --cov-fail-under=100' --coverage
> @bandscope/desktop@0.1.0 test
> node -e "require('node:fs').mkdirSync('coverage/.tmp', { recursive: true })" && vitest run --coverage
 RUN  v4.1.9 /home/runner/work/bandscope/bandscope/pr-head/apps/desktop
Coverage enabled with v8
✓ src/lib/export.test.ts (16 tests) 18ms
✓ src/lib/analysis.test.ts (14 tests) 49ms
✓ src/features/workspace/Workspace.test.tsx (11 tests) 1727ms
  ✓ enables bass transcription from selected role metadata rather than role id text 388ms
✓ src/components/ui/ui-primitives.test.tsx (7 tests) 448ms
  ✓ renders only custom progress children when supplied 323ms
✓ src/features/workspace/RoleSwitcher.test.tsx (4 tests) 358ms
stderr | src/App.test.tsx > App > applies pushed analysis status updates over the IPC event bridge
An update to App inside a test was not wrapped in act(...).
When testing, code that causes React state updates should be wrapped into act(...):
act(() => {
/* fire events that update state */
## Change Flow DAG
```mermaid
flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Changed file (2 files)"]
  S1 --> I1["repository behavior"]
  I1 --> R1["Review risk: Changed file (2 files)"]
  R1 --> V1["required checks"]
  Evidence --> S2["Test: test_api.py"]
  S2 --> I2["regression suite"]
  I2 --> R2["Review risk: Test: test_api.py"]
  R2 --> V2["targeted test run"]
```
OpenCode Review Overview
Pull request overview
OpenCode cannot approve yet because required coverage evidence did not pass.
Review outcome
1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove required test/docstring evidence
Coverage evidence
- Python project dependencies (services/analysis-engine)
- Python coverage with missing-line report (services/analysis-engine)
- Python docstring coverage
- JavaScript/TypeScript dependencies (npm ci)
- Repository docstring coverage
- JavaScript/TypeScript test coverage
Pull request overview
This PR hardens the analysis engine’s IPC/worker error handling by stopping raw Python exception strings from being propagated to clients, while preserving debuggability via server-side traceback logging.
Changes:
- Replaces `str(error)` payloads from `_stem_separation_worker` with safe, static error messages per exception type.
- Updates `run_analysis_job_updates` to return a generic “Stem separation failed” message for local stem separation failures.
- Adjusts unit tests to assert sanitized error messages and adds a Sentinel learning note documenting the pattern.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| services/analysis-engine/src/bandscope_analysis/api.py | Sanitizes error messages returned via queues/API and adds internal logging with exc_info=True. |
| services/analysis-engine/tests/test_api.py | Updates tests to verify generic/safe error payloads from worker and job update paths. |
| .Jules/sentinel.md | Adds a short security note documenting the exception-leakage prevention approach. |
Pull request overview
OpenCode reviewed the current-head evidence but found unresolved reviewer or review-agent threads before approval.
Findings
1. HIGH .github/workflows/opencode-review.yml:1 - Unresolved reviewer thread blocks automated approval
- Problem: OpenCode reached an APPROVE control result, but the approval step found unresolved, non-outdated human or review-agent thread evidence on the current pull request.
- Root cause: Reviewer and review-agent feedback can arrive after bounded model evidence is prepared, so the approval step must re-query GitHub immediately before publishing an approval.
- Fix: Address or resolve the listed reviewer thread(s), then re-run OpenCode on the current head.
- Regression test: Keep the approval gate querying reviewThreads(first: 100) after model output and before create_pull_review APPROVE, including bot review agents other than OpenCode itself.
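The re-query step described above can be sketched as a filter over review threads fetched just before approval. The input shape below is a simplified assumption: `isResolved` and `isOutdated` mirror the GitHub GraphQL review-thread fields, but the flattened `comments` list with a plain `author` string and the `opencode-agent` login used in the usage note are hypothetical.

```python
from typing import Any


def blocking_threads(
    threads: list[dict[str, Any]], own_login: str
) -> list[dict[str, Any]]:
    """Return unresolved, non-outdated threads not authored solely by own_login."""
    blocking = []
    for thread in threads:
        if thread.get("isResolved") or thread.get("isOutdated"):
            continue
        authors = {comment["author"] for comment in thread.get("comments", [])}
        if authors == {own_login}:
            # Threads containing only the reviewer's own comments do not block.
            continue
        blocking.append(thread)
    return blocking


def may_publish_approval(threads: list[dict[str, Any]], own_login: str) -> bool:
    """Approval is allowed only when no blocking thread remains."""
    return not blocking_threads(threads, own_login)
```

Called as `may_publish_approval(threads, "opencode-agent")`, this returns `False` as long as any human or other bot reviewer has an open, current thread.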
Review thread evidence
Latest unresolved reviewer thread evidence
services/analysis-engine/src/bandscope_analysis/api.py line 844
- Latest reviewer comment: @copilot-pull-request-reviewer at 2026-07-01T10:29:30Z
- Comment URL: #497 (comment)
- Comment excerpt: The log message interpolates the raw exception text (e.g., '{error}'), which can still contain sensitive details like full paths or decoder/library internals. Since 'exc_info=True' already records the traceback, log a static message (or a sanitized filename) and avoid embedding the exception string directly to better align with the worker’s goal of preventing information leakage.
services/analysis-engine/src/bandscope_analysis/api.py line 1093
- Latest reviewer comment: @copilot-pull-request-reviewer at 2026-07-01T10:29:30Z
- Comment URL: #497 (comment)
- Comment excerpt: This log message interpolates the raw exception text ('{error}'), which can contain sensitive details (paths, library internals). Since 'exc_info=True' already captures full diagnostic context, log a static message (or sanitized filename) instead of embedding the exception string.
services/analysis-engine/tests/test_api.py line 872
- Latest reviewer comment: @copilot-pull-request-reviewer at 2026-07-01T10:29:30Z
- Comment URL: #497 (comment)
- Comment excerpt: The assertion branches on ''oom' in str(error)' to distinguish 'RuntimeError' from other exceptions, but the production mapping is based on exception type, not message content. This makes the test brittle (a 'RuntimeError' without 'oom' would incorrectly expect the 'unexpected' message). Prefer branching on 'isinstance(error, RuntimeError)' instead of substring checks.
- Result: REQUEST_CHANGES
- Reason: unresolved reviewer or review-agent thread(s) were present before approval.
- Head SHA: `927ac4368fd811c2f5df46ace3a0a392f463f2af`
- Workflow run: 28510887202
- Workflow attempt: 1
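The brittleness flagged in the `test_api.py` thread can be shown with a small contrast. This is a sketch under assumptions: `sanitize` is a hypothetical stand-in for the production mapping, and both message strings are invented for illustration.

```python
RUNTIME_MESSAGE = "Stem separation backend failed."      # hypothetical string
UNEXPECTED_MESSAGE = "Stem separation failed unexpectedly."  # hypothetical string


def sanitize(error: BaseException) -> str:
    """Stand-in for the production mapping: keyed on exception type only."""
    if isinstance(error, RuntimeError):
        return RUNTIME_MESSAGE
    return UNEXPECTED_MESSAGE


def brittle_expected(error: BaseException) -> str:
    """The flagged pattern: the expectation depends on message text."""
    return RUNTIME_MESSAGE if "oom" in str(error) else UNEXPECTED_MESSAGE


def type_based_expected(error: BaseException) -> str:
    """The suggested pattern: the expectation depends on exception type."""
    return RUNTIME_MESSAGE if isinstance(error, RuntimeError) else UNEXPECTED_MESSAGE
```

For `RuntimeError("driver reset")`, `type_based_expected` agrees with `sanitize` while `brittle_expected` does not, which is the false test failure the reviewer describes.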
Changed-File Evidence Map
flowchart LR
PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
Evidence --> S1["Changed file (2 files)"]
S1 --> I1["repository behavior"]
I1 --> R1["Review risk: Changed file (2 files)"]
R1 --> V1["required checks"]
Evidence --> S2["Test: test_api.py"]
S2 --> I2["regression suite"]
I2 --> R2["Review risk: Test: test_api.py"]
R2 --> V2["targeted test run"]
Updated the branch to address the Copilot/OpenCode review findings: removed raw exception interpolation from stem-separation logs while retaining |
Acknowledged. |
- Prevent internal Python exception strings (`str(error)`) from being propagated over IPC.
- Ensure the API boundary does not leak server paths or memory limits (like `oom`) to clients.
- Map specific exceptions (FileNotFoundError, ValueError, RuntimeError) to safe, generic strings to avoid CWE-209.
- Use `logger.error` with `exc_info=True` instead of dropping or exposing error details, so that diagnostics remain available on the server side only.
This PR is superseded by #538, which keeps the exception information leakage fix as a clean two-file diff against #538 removes raw exception strings from worker queue payloads and local-audio failure envelopes, keeps diagnostics in static |
Pull request overview
OpenCode cannot approve yet because required coverage evidence did not pass.
Review outcome
1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove required test/docstring evidence
- Problem: The required coverage-evidence job result was `failure`, so OpenCode cannot establish approval sufficiency for this head.
- Root cause: Automated approval is only valid when the same-head coverage-evidence job proves supported repository test suites passed and configured docstring gates passed or were advisory, or reports not applicable because no supported source files or package manifests exist. Missing, failed, skipped, unavailable, or unsupported-tooling test evidence is a blocker.
- Fix: Install or configure the repository test/docstring evidence tooling when source files or package manifests exist, rerun the current-head coverage-evidence job, and approve only after it reports `success` with required evidence or explicit no-source not-applicable evidence.
- Regression test: Keep the approval branch checking `needs.coverage-evidence.result == success` before posting APPROVE, and publish REQUEST_CHANGES when coverage-evidence blocker states such as cancelled, skipped, failed, unsupported-tooling, or below-100 evidence are present.
- Result: REQUEST_CHANGES
- Reason: coverage-evidence result was `failure`, so required test/docstring evidence was not proven for current head `db8919ccbd554d2bb8ae09600ac7f538b7373b5a`.
- Head SHA: `db8919ccbd554d2bb8ae09600ac7f538b7373b5a`
- Workflow run: 28579688332
- Workflow attempt: 1
Coverage evidence
- Head SHA: `db8919ccbd554d2bb8ae09600ac7f538b7373b5a`
- Required test evidence: supported repository test suites must pass.
- Required docstring evidence: repository-owned docstring gates must pass when configured; otherwise docstring coverage is advisory.
Python project dependencies (services/analysis-engine)
Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
Creating virtual environment at: services/analysis-engine/.venv
Resolved 49 packages in 0.59ms
Building bandscope-analysis @ file:///home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine
Downloading pygments (1.2MiB)
Downloading scipy (33.6MiB)
Downloading scikit-learn (8.5MiB)
Downloading yt-dlp (3.0MiB)
Downloading soundfile (1.3MiB)
Downloading ruff (10.7MiB)
Downloading mypy (13.0MiB)
Downloading numpy (15.8MiB)
Downloading llvmlite (53.7MiB)
Downloading numba (3.6MiB)
Downloaded soundfile
Downloaded pygments
Built bandscope-analysis @ file:///home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine
Downloaded numba
Downloaded ruff
Downloaded yt-dlp
Downloaded scikit-learn
Downloaded numpy
Downloaded llvmlite
Downloaded scipy
Downloaded mypy
Prepared 44 packages in 2.17s
Installed 44 packages in 63ms
+ audioread==3.1.0
+ bandit==1.9.4
+ bandscope-analysis==0.1.0 (from file:///home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine)
+ certifi==2026.2.25
+ cffi==2.0.0
+ charset-normalizer==3.4.6
+ coverage==7.13.4
+ decorator==5.2.1
+ idna==3.18
+ iniconfig==2.3.0
+ joblib==1.5.3
+ lazy-loader==0.5
+ librosa==0.11.0
+ librt==0.8.1
+ llvmlite==0.45.1
+ markdown-it-py==4.0.0
+ mdurl==0.1.2
+ msgpack==1.2.1
+ mypy==1.19.1
+ mypy-extensions==1.1.0
+ numba==0.62.1
+ numpy==2.3.5
+ packaging==26.0
+ pathspec==1.0.4
+ platformdirs==4.9.4
+ pluggy==1.6.0
+ pooch==1.9.0
+ pycparser==3.0
+ pygments==2.20.0
+ pytest==9.0.3
+ pytest-cov==7.0.0
+ pyyaml==6.0.3
+ requests==2.33.0
+ rich==15.0.0
+ ruff==0.15.5
+ scikit-learn==1.8.0
+ scipy==1.17.1
+ soundfile==0.13.1
+ soxr==1.0.0
+ stevedore==5.7.0
+ threadpoolctl==3.6.0
+ typing-extensions==4.15.0
+ urllib3==2.7.0
+ yt-dlp==2026.6.9
- Result: PASS
Python coverage with missing-line report (services/analysis-engine)
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine
configfile: pyproject.toml
plugins: cov-7.0.0
collected 441 items
tests/test_activity.py ........ [ 1%]
tests/test_anchors.py .... [ 2%]
tests/test_api.py ......................... [ 8%]
tests/test_chord_recognizer.py .................... [ 12%]
tests/test_chords.py ......................... [ 18%]
tests/test_cli.py ................. [ 22%]
tests/test_extractor.py ...... [ 23%]
tests/test_health.py . [ 24%]
tests/test_pipeline_integration.py ......... [ 26%]
tests/test_pitch_tracker.py ............... [ 29%]
tests/test_priority.py ....... [ 31%]
tests/test_ranges.py ................... [ 35%]
tests/test_release_asset_selection.py ........ [ 37%]
tests/test_release_metadata.py ....... [ 38%]
tests/test_release_packaging.py ......... [ 40%]
tests/test_roles.py ....... [ 42%]
tests/test_roles_ml.py ... [ 43%]
tests/test_segmenter.py ..................... [ 47%]
tests/test_separation.py .................................. [ 55%]
tests/test_supply_chain_policy.py ...................................... [ 64%]
........................................................................ [ 80%]
......................................F.............. [ 92%]
tests/test_temporal.py ......... [ 94%]
tests/test_transcription.py ... [ 95%]
tests/test_tuning.py ..... [ 96%]
tests/test_youtube.py ................ [100%]
=================================== FAILURES ===================================
___________ test_pr_review_merge_scheduler_uses_github_actions_token ___________
    def test_pr_review_merge_scheduler_uses_github_actions_token() -> None:
        """Ensure mechanical PR queue handling uses the workflow token, not the review app token."""
        repo_root = Path(__file__).resolve().parents[3]
        workflow = (repo_root / ".github" / "workflows" / "pr-review-merge-scheduler.yml").read_text(
            encoding="utf-8"
        )
>       assert "contents: write" in workflow
E       assert 'contents: write' in 'name: PR Review Merge Scheduler\n\non:\n schedule:\n - cron: "17 */2 * * *"\n workflow_dispatch:\n inputs:\n ... args+=(--no-update-branches)\n fi\n python3 scripts/ci/pr_review_merge_scheduler.py "${args[@]}"\n'
tests/test_supply_chain_policy.py:4979: AssertionError
=============================== warnings summary ===============================
tests/test_pipeline_integration.py::test_pipeline_without_detected_sections_falls_back
tests/test_roles.py::test_role_extractor_falls_back_when_activity_detection_fails
/home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine/.venv/lib/python3.12/site-packages/librosa/core/pitch.py:103: UserWarning: Trying to estimate tuning from empty frequency set.
return pitch_tuning(
tests/test_roles.py::test_role_extractor_falls_back_when_activity_detection_fails
/home/runner/work/bandscope/bandscope/pr-head/services/analysis-engine/.venv/lib/python3.12/site-packages/librosa/core/spectrum.py:266: UserWarning: n_fft=2048 is too large for input signal of length=100
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_supply_chain_policy.py::test_pr_review_merge_scheduler_uses_github_actions_token - assert 'contents: write' in 'name: PR Review Merge Scheduler\n\non:\n schedule:\n - cron: "17 */2 * * *"\n workflow_dispatch:\n inputs:\n ... args+=(--no-update-branches)\n fi\n python3 scripts/ci/pr_review_merge_scheduler.py "${args[@]}"\n'
============= 1 failed, 440 passed, 3 warnings in 90.17s (0:01:30) =============
- Result: FAIL (exit 1)
Python docstring coverage
- Result: DEFERRED
- Reason: package.json defines check:python-docstrings; repository-owned docstring coverage runs after package dependency setup.
JavaScript/TypeScript dependencies (npm ci)
added 272 packages, and audited 275 packages in 8s
71 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
- Result: PASS
Repository docstring coverage
> bandscope@0.1.3 check:python-docstrings
> sh -c 'cd services/analysis-engine && uv run ruff check src tests ../../scripts --select D100,D101,D102,D103,D104,D105,D106,D107'
All checks passed!
- Result: PASS
JavaScript/TypeScript test coverage
> bandscope@0.1.3 test
> npm run test --workspaces --if-present && sh -c 'cd services/analysis-engine && uv run pytest tests --cov=src/bandscope_analysis --cov-report=term-missing --cov-fail-under=100' --coverage
> @bandscope/desktop@0.1.0 test
> node -e "require('node:fs').mkdirSync('coverage/.tmp', { recursive: true })" && vitest run --coverage
 RUN  v4.1.9 /home/runner/work/bandscope/bandscope/pr-head/apps/desktop
Coverage enabled with v8
✓ src/lib/export.test.ts (16 tests) 27ms
✓ src/lib/analysis.test.ts (14 tests) 19ms
✓ src/features/workspace/Workspace.test.tsx (11 tests) 1904ms
  ✓ enables bass transcription from selected role metadata rather than role id text 476ms
  ✓ renders collaboration summaries and role-specific rehearsal planning details 304ms
✓ src/components/ui/ui-primitives.test.tsx (7 tests) 230ms
✓ src/features/workspace/RoleSwitcher.test.tsx (4 tests) 438ms
  ✓ renders the title and role options 336ms
✓ src/i18n/index.test.ts (9 tests) 9ms
stderr | src/App.test.tsx > App > applies pushed analysis status updates over the IPC event bridge
An update to App inside a test was not wrapped in act(...).
When testing, code that causes React state updates should be wrapped into act(...):
act(() => {
/* fire events that update state */
});
/* assert on the output */
This ensures that you're testing the behavior the user would see in the browser. Learn more at https://react.dev/link/wrap-tests-with-act
## Changed-File Evidence Map
```mermaid
flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Changed file (21 files)"]
  S1 --> I1["repository behavior"]
  I1 --> R1["Review risk: Changed file (21 files)"]
  R1 --> V1["required checks"]
  Evidence --> S2["Workflow (2 files)"]
  S2 --> I2["GitHub Actions review job"]
  I2 --> R2["Review risk: Workflow (2 files)"]
  R2 --> V2["actionlint plus required checks"]
  Evidence --> S3["Docs (5 files)"]
  S3 --> I3["operator or user guidance"]
  I3 --> R3["Review risk: Docs (5 files)"]
  R3 --> V3["docs review"]
  Evidence --> S4["CI script (8 files)"]
  S4 --> I4["review and security gate shell path"]
  I4 --> R4["Review risk: CI script (8 files)"]
  R4 --> V4["bash -n plus Strix self-test"]
  Evidence --> S5["Test (4 files)"]
  S5 --> I5["regression suite"]
  I5 --> R5["Review risk: Test (4 files)"]
  R5 --> V5["targeted test run"]
```
🚨 Severity: HIGH
💡 Vulnerability: Raw Python exceptions inside background workers were being stringified (`str(error)`) and sent directly to IPC payloads/API responses.
🎯 Impact: Attackers or users could see internal system states, memory limits, or sensitive local file paths exposed via exception messages.
🔧 Fix: Switched from blindly passing `str(error)` to returning safe, static strings. Added internal logging with `exc_info=True` so that operational visibility is not lost and the actual cause of the error is still recorded on the server.
✅ Verification: Ran unit tests to verify that `fake_queue` receives generic strings for all expected failure modes and that no path or process details are present in the error payload.
PR created automatically by Jules for task 176780846299095447 started by @seonghobae
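One way to check that a static log message still keeps the diagnostic detail is to capture the log record and inspect it. The sketch below assumes nothing about the real module: `separate_stems`, `guarded_separation`, and the logger name are hypothetical, and only standard `logging` calls are used.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("static_log_sketch")
logger.setLevel(logging.ERROR)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def separate_stems(path: str) -> None:
    """Hypothetical failing operation whose error text contains a path."""
    raise FileNotFoundError(f"No such file: {path}")


def guarded_separation(path: str) -> str:
    """Return a static client message; keep the traceback in the log record."""
    try:
        separate_stems(path)
    except Exception:
        # No exception text is interpolated into the message itself.
        logger.error("Stem separation failed", exc_info=True)
        return "Stem separation failed"
    return "ok"
```

After a call such as `guarded_separation("/srv/audio/take1.wav")`, the captured record's message is the static string, while `record.exc_info` still carries the original exception for server-side formatters.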