Skip to content

chore(ci): unblock CI by gating bash tests on tree-sitter; sort review prompt glob#617

Open
elfensky wants to merge 1 commit into
peteromallet:mainfrom
elfensky:chore/ci-unblock-pre-existing-failures
Open

chore(ci): unblock CI by gating bash tests on tree-sitter; sort review prompt glob#617
elfensky wants to merge 1 commit into
peteromallet:mainfrom
elfensky:chore/ci-unblock-pre-existing-failures

Conversation

@elfensky
Copy link
Copy Markdown
Contributor

@elfensky elfensky commented Jun 1, 2026

Summary

CI has been red on main since 2026-04-06, and on every PR opened since, because of 5 pre-existing test failures unrelated to any feature work. Because they live on main, every PR inherits them and can't go green without bundling a workaround (see #614, #616). This standalone, test-only PR fixes the root cause so main gets a green baseline.

Two distinct root causes:

1. Bash tests assume tree-sitter is installed

desloppify/tests/lang/common/test_bash_unused_imports.py calls detect_unused_imports, which parses via tree-sitter. tree-sitter is an optional dependency ([project.optional-dependencies]), so the tests-core job (no extras) doesn't have it — there detect_unused_imports returns [] and the tests, asserting specific findings, fail. In tests-full (pip install -e ".[full]") tree-sitter is present and the tests pass.

Fix: gate the file with the exact skipif pattern its sibling tree-sitter tests already use (test_treesitter.py, test_treesitter_complexity_and_integration.py):

pytestmark = pytest.mark.skipif(
    not is_available(), reason="tree-sitter-language-pack not installed"
)

This is a consistency fix — the bash file was the only tree-sitter-dependent test file missing the guard. Coverage is preserved: with tree-sitter (tests-full) the tests run and assert real findings; without it (tests-core) they skip cleanly, because the feature genuinely cannot run there.

2. Review tests rely on filesystem-dependent glob order

test_review_commands.py (and its integration counterpart) use list(runs_dir.glob("*/prompts/batch-*.md")) unsorted. macOS happens to return batch-1.md first; Linux returns batch-2.md first. The assertion "Previously flagged issues" in prompt_text only holds for batch-1.md, so it fails on Linux.

Fix: sorted(...) the glob. Deterministic across OSes, removes a latent flake that only passed on macOS by luck, and the assertion still verifies exactly what it intends (batch-1's content).

Why this is safe

  • Test-only. No product/runtime behaviour changes.
  • No tests made meaningless. Fix 1 preserves full bash coverage in tests-full and only skips where the optional feature is absent; the graceful-degradation path is itself covered by test_treesitter.py::test_is_available_false_when_uninstalled. Fix 2 strengthens (de-flakes) its test.

Verification

Reproduced both CI environments locally:

  • tree-sitter present (mirrors tests-full): the 3 bash tests run and pass, both review tests pass.
  • tree-sitter absent (mirrors tests-core): the bash tests skip with reason tree-sitter-language-pack not installed; review fix is OS-independent.

🤖 Generated with Claude Code

…w prompt glob

## The problem

CI has been red on `main` since 2026-04-06, and on every PR opened
since, because of 5 pre-existing test failures that are unrelated to
any feature work. Because they live on `main`, every PR inherits them
and cannot go green without bundling a workaround. Two distinct root
causes:

  1. Three bash tests in
     `desloppify/tests/lang/common/test_bash_unused_imports.py` call
     `detect_unused_imports`, which parses via tree-sitter. tree-sitter
     is an *optional* dependency (`[project.optional-dependencies]`),
     so the `tests-core` job (which installs no extras) does not have
     it; there the function returns `[]` and the tests, asserting
     specific findings, fail. In `tests-full` (installs `.[full]`)
     tree-sitter is present and the tests pass.

  2. Two review tests in
     `desloppify/tests/review/test_review_commands.py` and its
     integration counterpart use
     `list(runs_dir.glob("*/prompts/batch-*.md"))` without sorting.
     The glob order is filesystem-dependent: on macOS it happens to
     return `batch-1.md` (the batch with `historical_issue_focus`)
     first, on Linux it returns `batch-2.md` first. The assertion
     `"Previously flagged issues" in prompt_text` then fails on Linux
     because that string only appears in `batch-1.md`.

## The solution

Two small, behaviour-preserving test fixes:

  1. Gate the bash test file on tree-sitter availability with the exact
     pattern its sibling tree-sitter tests already use
     (`test_treesitter.py`, `test_treesitter_complexity_and_integration.py`):

     ```python
     pytestmark = pytest.mark.skipif(
         not is_available(),
         reason="tree-sitter-language-pack not installed",
     )
     ```

     This is a consistency fix — the bash file was the only
     tree-sitter-dependent test file missing the guard. Coverage is
     preserved: with tree-sitter installed (`tests-full`) the tests
     run and assert real findings as before; without it (`tests-core`)
     they skip cleanly, because the feature genuinely cannot run.

  2. Wrap the review test's glob in `sorted(...)`:

     ```python
     prompt_files = sorted(runs_dir.glob("*/prompts/batch-*.md"))
     ```

     This makes the test deterministic across operating systems —
     `batch-1.md` always sorts first — and removes a latent flake that
     only passed on macOS by luck. The assertion still verifies exactly
     what it intends (batch-1's content).

## Scope

This is a standalone maintenance PR against `main`: it touches only test
files and changes no product behaviour. Landing it here gives `main` —
and therefore every open and future PR (peteromallet#614, peteromallet#616, …) — a green
baseline, instead of each PR having to carry the same workaround.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant