Skip to content

fix: improve E2E test stability with proper element waits and video recording#37

Merged
raven-wing merged 12 commits into
Problematy:mainfrom
raven-wing:fix_workaround
Jan 21, 2026
Merged

fix: improve E2E test stability with proper element waits and video recording#37
raven-wing merged 12 commits into
Problematy:mainfrom
raven-wing:fix_workaround

Conversation

@raven-wing

@raven-wing raven-wing commented Jan 17, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

  • Tests
    • Added comprehensive left-panel tests (desktop/tablet/mobile) and improved mobile interaction reliability by waiting for readiness before taps/clicks.
  • Chores
    • E2E tests now use the minified frontend and per-test video recording is enabled with retain-on-failure.
  • CI
    • Stabilized frontend startup with a short delay and added automatic upload of failure videos for easier debugging.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai

coderabbitai Bot commented Jan 17, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Switches frontend test assets to minified bundles, refines Playwright mobile contexts to record per-test videos with conditional retention, adds a frontend startup delay and failure-video artifact upload in CI, replaces force-clicks with wait-then-click in mobile tests, and adds a comprehensive left-panel test suite across viewports.

Changes

Cohort / File(s) Summary
CI workflow & artifact upload
​.github/workflows/e2e-tests.yml
Start frontend with make serve-prod, add a 3s sleep after server start, and add a step to upload e2e-tests/test-results/**/*.webm (ignore if none) as test-failure-videos with 7-day retention, placed before stopping the backend.
Frontend bundle URL updates
Config files: tests/conftest.py, e2e_stress_test_config.yml, e2e_test_config.yml
Switched served frontend script references from index.js to index.min.js (WEBPACK_SCRIPT_URL / GOODMAP_FRONTEND_LIB_URL). Removed the webpack overlay removal script constant and its injection.
Playwright video & test-result plumbing
tests/conftest.py, pyproject.toml
Added pytest_runtest_makereport hook; mobile_page fixture now supports indirect parametrization and configures per-test record_video_dir and record_video_size when --video is enabled; captures video path on teardown and conditionally deletes videos when --video=retain-on-failure. Pytest ini options set video output to test-results.
Mobile interaction adjustments
tests/basic/test_location_buttons.py
Replaced force=True clicks with wait_for(state="visible") then regular click(), removed forced body clicks, and switched tests to accept/use indirect mobile_page parametrization.
New left-panel E2E tests
tests/basic/test_left_panel.py
Added extensive left-panel tests covering desktop, tablet, and mobile behaviors, scrollbar and tooltip checks, and multiple viewport/device parametrizations.

Sequence Diagram(s)

sequenceDiagram
    participant CI as CI runner
    participant Frontend as Frontend server
    participant Playwright as Playwright runner
    participant Browser as Browser/device
    participant Artifacts as Artifact storage

    CI->>Frontend: run `make serve-prod`
    CI->>CI: sleep 3s
    CI->>Playwright: start tests (uses index.min.js)
    Playwright->>Browser: create context (mobile? include record_video_dir & size)
    Browser->>Playwright: execute tests (recording per-test)
    alt test failed
        Playwright->>CI: emit failure report (pytest_runtest_makereport)
        CI->>Artifacts: upload `e2e-tests/test-results/**/*.webm`
    else test passed
        Playwright->>Playwright: delete per-test video if retain-on-failure
    end
    CI->>Frontend: stop server
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I hopped in with a headset, sleepy and spry,
Minified scripts hum beneath the sky,
I wait, I tap, I record every run,
Keep the vids when a test has no fun,
Left panel checked — a carrot well won. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main changes: replacing force=True with proper wait-based interactions for element stability, adding video recording for test diagnostics, and using indirect parametrization for improved test structure.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 4087b37

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: c3e3823

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 182d1ee

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@tests/basic/test_left_panel.py`:
- Around line 60-78: The test_no_page_scrollbar_on_desktop currently uses
document.body.scrollHeight > document.body.clientHeight which can be true even
with overflow:hidden; change the assertion to attempt an actual scroll and
verify it fails: in the test_no_page_scrollbar_on_desktop function, after
waiting for "#filter-form", run a page.evaluate that records initial scroll
position (window.scrollY or document.documentElement.scrollTop), attempts to
scroll (e.g. window.scrollTo(0, 100) or set document.documentElement.scrollTop),
then read the scroll position again and assert it did not change; this ensures
the page cannot be scrolled rather than relying on scrollHeight comparisons.
- Line 145: The unused parameter device_name in test functions (e.g.,
test_panel_hidden_by_default_on_mobile) is triggering ARG002; at the top of each
affected test body (lines indicated: tests named
test_panel_hidden_by_default_on_mobile and the other parametrized tests at the
given positions) add a no-op assignment like _ = device_name  # required by
parametrization; intentionally unused to silence the lint warning; update the
same pattern in all listed tests (lines 145, 162, 182, 200, 226, 256) so the
pytest parametrization remains intact while ARG002 is resolved.
🧹 Nitpick comments (1)
tests/basic/test_left_panel.py (1)

80-114: Ensure the scroll test actually exercises overflow.

If content no longer overflows, this test passes without validating scrolling. Add an explicit assertion so failures are actionable.

♻️ Suggested tweak
         scroll_info = panel_body.evaluate(
             """el => ({
                 clientHeight: el.clientHeight,
                 scrollHeight: el.scrollHeight,
                 hasOverflow: el.scrollHeight > el.clientHeight
             })"""
         )
 
-        # If there's overflow, verify scrolling works
-        if scroll_info["hasOverflow"]:
-            # Get initial scroll position
-            initial_scroll = panel_body.evaluate("el => el.scrollTop")
-
-            # Scroll down
-            panel_body.evaluate("el => el.scrollBy(0, 200)")
-
-            # Verify scroll position changed
-            new_scroll = panel_body.evaluate("el => el.scrollTop")
-            assert new_scroll > initial_scroll, "Panel body should be scrollable"
+        assert scroll_info["hasOverflow"], (
+            "Expected panel content to overflow for the scroll test; "
+            "reduce the viewport height or add fixtures if needed."
+        )
+
+        # Verify scrolling works
+        initial_scroll = panel_body.evaluate("el => el.scrollTop")
+        panel_body.evaluate("el => el.scrollBy(0, 200)")
+        new_scroll = panel_body.evaluate("el => el.scrollTop")
+        assert new_scroll > initial_scroll, "Panel body should be scrollable"

Comment thread tests/basic/test_left_panel.py Outdated
Comment thread tests/basic/test_left_panel.py Outdated
@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: f464a2f

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@tests/basic/test_location_buttons.py`:
- Around line 248-260: The 200ms timeouts used for tooltip visibility checks are
too short and inconsistent with the rest of the file; update the two expect(...)
calls that use timeout=200 (the visibility and not_to_be_visible assertions
referencing the tooltip locator in tests/basic/test_location_buttons.py using
mobile_page and tooltip) to use a larger timeout (at minimum 1000ms, preferably
5000ms to match surrounding patterns and CI reliability).

In `@tests/conftest.py`:
- Around line 276-294: Replace request.node.name with request.node.nodeid when
constructing the test-specific video directory to avoid name collisions; update
the assignment that builds test_name (currently using request.node.name and
replace("[", "-").replace("]", "")) to use request.node.nodeid and sanitize path
separators and parameter brackets (e.g., replace path separators and brackets)
before creating record_video_dir in the context creation block so
record_video_dir remains unique per test run.
🧹 Nitpick comments (1)
tests/basic/test_location_buttons.py (1)

220-232: Gate the 2s debug pause behind an environment variable to avoid slowing every mobile test run.

The unconditional wait_for_timeout(2000) will add noticeable runtime across all mobile tests. Since the code is explicitly marked for removal after debugging, it should be conditional. The fixture pattern already uses configuration-based conditionals (e.g., video recording), so the approach is consistent with the codebase.

🛠️ Proposed adjustment
+import os
 import pytest
 from playwright.sync_api import Page, expect
@@
-        # DEBUG: Keep tooltip visible for video capture - remove after debugging
-        mobile_page.wait_for_timeout(2000)
+        # DEBUG: Keep tooltip visible for video capture - remove after debugging
+        if os.getenv("DEBUG_VIDEO_PAUSE") == "1":
+            mobile_page.wait_for_timeout(2000)

Comment thread tests/basic/test_location_buttons.py Outdated
Comment thread tests/conftest.py
@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: dc0e80b

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 5df55c4

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@raven-wing raven-wing changed the title fix: make this work without hacks fix: improve E2E test stability with proper element waits and video recording Jan 21, 2026
@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: e4c528e

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 0086727

📊 E2E Stress Test Performance

Status: PASSED (11732.11ms max < 25000ms limit)

Metric Value
Average Time 10418.87ms
Minimum Time 9727.67ms
Maximum Time 11732.11ms
Completed Runs 5/5
Avg Markers Loaded 72
📈 Individual Run Times
Run Time (ms) Markers
Run 1 9727.67ms 72
Run 2 9838.75ms 72
Run 3 11005.65ms 72
Run 4 11732.11ms 72
Run 5 9790.17ms 72

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@tests/conftest.py`:
- Around line 123-132: The pytest hook function pytest_runtest_makereport
currently declares an unused parameter call causing Ruff ARG001; rename the
parameter to _call (i.e., def pytest_runtest_makereport(item, _call):) so
linters recognize it as intentionally unused, or alternatively add a targeted
noqa comment for ARG001 on the function signature—update the function
declaration and any internal references to the parameter name accordingly.
- Around line 307-316: The retain-on-failure logic only checks
request.node.rep_call for failures and should instead consider all test phases;
update the block that inspects video_path/video_option to determine test_failed
by checking request.node.rep_setup, request.node.rep_call, and
request.node.rep_teardown (each may or may not exist) and treat the test as
failed if any of those report .failed; keep the existing deletion/unlink and
empty-directory removal behavior but skip deletion when any phase failed so
videos are preserved.
♻️ Duplicate comments (1)
tests/conftest.py (1)

255-266: Avoid video dir collisions by using nodeid (not name).

request.node.name can collide across parametrized or same-named tests, which risks overwriting videos.

🛠️ Proposed fix
-        test_name = request.node.name.replace("[", "-").replace("]", "")
-        record_video_dir = str(Path(output_dir) / test_name)
+        nodeid = request.node.nodeid
+        safe_nodeid = (
+            nodeid.replace("::", "__")
+            .replace("/", "__")
+            .replace("[", "-")
+            .replace("]", "")
+        )
+        record_video_dir = str(Path(output_dir) / safe_nodeid)

Also applies to: 272-273

Comment thread tests/conftest.py
Comment thread tests/conftest.py
@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 0086727

📊 E2E Stress Test Performance

Status: PASSED (10816.12ms max < 25000ms limit)

Metric Value
Average Time 10143.11ms
Minimum Time 9733.51ms
Maximum Time 10816.12ms
Completed Runs 5/5
Avg Markers Loaded 67
📈 Individual Run Times
Run Time (ms) Markers
Run 1 10021.49ms 67
Run 2 9733.51ms 67
Run 3 10383.69ms 67
Run 4 10816.12ms 67
Run 5 9760.74ms 67

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 0086727

📊 E2E Stress Test Performance

Status: PASSED (11138.4ms max < 25000ms limit)

Metric Value
Average Time 10337.24ms
Minimum Time 9929.39ms
Maximum Time 11138.4ms
Completed Runs 5/5
Avg Markers Loaded 74
📈 Individual Run Times
Run Time (ms) Markers
Run 1 10019.32ms 74
Run 2 9929.39ms 74
Run 3 10370.08ms 74
Run 4 11138.4ms 74
Run 5 10229.02ms 74

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: a7d1462

⚠️ E2E Stress Test Results

Performance data not found. See workflow logs for details.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@tests/basic/test_left_panel.py`:
- Around line 308-309: The test clicks the toggle button immediately which can
be flaky; before calling toggle_button.click() in
tests/basic/test_left_panel.py, wait for the element to be visible/attached
using toggle_button.wait_for(...) (e.g., state="visible" or "attached") to
mirror the mobile tests pattern, then perform toggle_button.click(); update the
code around the toggle_button locator and click to include this wait to ensure
stability.
- Around line 236-237: The test uses
mobile_page.locator('button[aria-label="Toggle left panel"]') assigned to
toggle_button and calls toggle_button.click() directly; add a wait for
visibility to match other tests by calling
toggle_button.wait_for(state="visible") before toggle_button.click() so the
button is guaranteed ready and flakiness is avoided.
- Around line 266-267: The click on the left-panel toggle is missing an explicit
wait which can cause flakiness; before calling toggle_button.click() (where
toggle_button is created via mobile_page.locator('button[aria-label="Toggle left
panel"]')), wait for the locator to be visible/enabled using the
locator.wait_for(...) API (or mobile_page.wait_for_selector with the same
selector), then perform the click to ensure stability.
🧹 Nitpick comments (1)
tests/basic/test_left_panel.py (1)

370-375: Test assertion may always pass in Chromium — consider browser-specific handling.

Allowing empty string means this test passes even when no custom scrollbar styling is applied in Chromium. Since scrollbar-width is Firefox-only, you might consider:

  1. Skip this test on Chromium, or
  2. Add a separate Chromium-specific check using CSS selector inspection for ::-webkit-scrollbar rules

This is a minor observation — the current approach is documented and the test still provides value for Firefox.

Comment thread tests/basic/test_left_panel.py
Comment thread tests/basic/test_left_panel.py
Comment thread tests/basic/test_left_panel.py
@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 4c88b6a

📊 E2E Stress Test Performance

Status: PASSED (13194.93ms max < 25000ms limit)

Metric Value
Average Time 12155.04ms
Minimum Time 11656.68ms
Maximum Time 13194.93ms
Completed Runs 5/5
Avg Markers Loaded 69
📈 Individual Run Times
Run Time (ms) Markers
Run 1 11656.68ms 69
Run 2 13194.93ms 69
Run 3 11857.16ms 69
Run 4 12205.58ms 69
Run 5 11860.86ms 69

@github-actions

Copy link
Copy Markdown

🧪 E2E Test Results

📊 View full workflow run
🔗 Commit: 83d49e7

📊 E2E Stress Test Performance

Status: PASSED (11211.12ms max < 25000ms limit)

Metric Value
Average Time 10569.96ms
Minimum Time 9853.77ms
Maximum Time 11211.12ms
Completed Runs 5/5
Avg Markers Loaded 74
📈 Individual Run Times
Run Time (ms) Markers
Run 1 9853.77ms 74
Run 2 10531.86ms 74
Run 3 10169.21ms 74
Run 4 11083.86ms 74
Run 5 11211.12ms 74

@raven-wing raven-wing merged commit 6bafe2b into Problematy:main Jan 21, 2026
2 checks passed
@raven-wing raven-wing deleted the fix_workaround branch January 21, 2026 12:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant