test(mock-llm): strengthen E2E coverage for conversation and automation flows by malhotra5 · Pull Request #940 · OpenHands/agent-canvas

malhotra5 · 2026-05-29T16:58:10Z

A human has tested these changes.

Why

Issue #511 identified gaps in mock-LLM E2E test coverage. The existing tests exercised happy-path agent output but did not verify several aspects users depend on: sidebar presence, user message rendering, payload correctness (worktree: true), settings API consistency after profile activation, conversation resume, or automation detail page UI elements.

Summary

Conversation test: add 4 new test.step checks in existing steps (settings API model verification, worktree payload interception, user message visibility, sidebar link) and 1 entirely new step (resume conversation from sidebar after navigating away).
Automation test: add 2 new test.step checks (active status badge and cron schedule display on the detail page).
Adjust conversation test cleanup to preserve the step 3 conversation for the step 4 resume test, with afterAll safety-net cleanup.

Issue Number

#511

How to Test

Run the mock-LLM E2E tests:

npm run build:app  # if build/ is missing
npm run test:e2e:mock-llm

All 7 tests (4 conversation steps + 3 automation steps) should pass. The new checks are test.step blocks within existing serial tests, so they run as part of the existing flow.

Type

This PR was created by an AI agent (OpenHands) on behalf of the user.

@malhotra5 can click here to continue refining the PR

🐳 Docker images for this PR

• GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

Component	Value
Image	`ghcr.io/openhands/agent-canvas`
Architectures	amd64, arm64
Agent Server	`ghcr.io/openhands/agent-server:1.24.0-python`
Automation	`openhands-automation==1.0.0a5`
Commit	`a6a2c5c19a2486d858cd440bb956c085b339e4b3`

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-a6a2c5c

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-a6a2c5c

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-a6a2c5c-amd64
ghcr.io/openhands/agent-canvas:test-mock-llm-coverage-511-amd64
ghcr.io/openhands/agent-canvas:pr-940-amd64
ghcr.io/openhands/agent-canvas:sha-a6a2c5c-arm64
ghcr.io/openhands/agent-canvas:test-mock-llm-coverage-511-arm64
ghcr.io/openhands/agent-canvas:pr-940-arm64
ghcr.io/openhands/agent-canvas:sha-a6a2c5c
ghcr.io/openhands/agent-canvas:test-mock-llm-coverage-511
ghcr.io/openhands/agent-canvas:pr-940

About Multi-Architecture Support

Each tag (e.g., sha-a6a2c5c) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., sha-a6a2c5c-amd64) are also available if needed

…on flows Conversation test additions (mock-llm-conversation.spec.ts): - Step 2: verify settings API reflects active profile's llm.model and base_url - Step 3: intercept POST /api/conversations and assert worktree:true in payload - Step 3: verify user message is visible in a user-message element - Step 3: verify conversation appears in sidebar with correct link - Step 4 (new): resume conversation from sidebar after navigating away, verify agent reply and user message are still visible Automation test additions (mock-llm-automation.spec.ts): - Step 3: verify active-status-badge-active is visible on detail page - Step 3: verify cron schedule (or human-readable equivalent) on detail page Addresses coverage gaps from issue #511 'I can' statements: - I can start the Agent Canvas (conversation in sidebar) - worktree flag in conversation creation payload - I can create conversations, list and resume them - Activate profile -> settings API reflects model - I can run automations on a schedule (UI verification) Co-authored-by: openhands <openhands@all-hands.dev>

vercel · 2026-05-29T16:58:16Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-canvas	Ready	Preview, Comment	May 29, 2026 6:29pm

github-actions · 2026-05-29T17:02:34Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 7347f569 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	294ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.6s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

all-hands-bot · 2026-05-29T17:03:56Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

github-actions · 2026-05-29T17:05:56Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 9e542772 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	238ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.6s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.7s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.6s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

all-hands-bot

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

Summary

This PR does a good job filling the gaps identified in #511. The new checks are well-structured, the afterAll safety net is a thoughtful addition, and the explanation of the LIFO conflict between page.on and page.route in the comments is exactly the kind of context future maintainers need. A few items below are worth addressing before merge.

Issues

1. Broad URL filter may capture unintended POST requests

req.url().includes("/api/conversations") will match any path containing that substring — including future paths like /api/conversations/{id}/messages or /api/conversations/{id}/events. Since only the conversation-creation POST is relevant here, tighten the match:

const url = new URL(req.url());
if (req.method() === "POST" && url.pathname === "/api/conversations") {

or, simpler:

req.url().endsWith("/api/conversations")

As-is, if the app ever issues a second POST to a /api/conversations/* sub-path before the user message is sent, capturedConversationPayload would be silently overwritten and the assertion could pass with stale data.

2. Hardcoded user message string in step 4

The user message "Please run a quick terminal command and then reply." is defined as userMessage in step 3 (line 288) but then hardcoded again in step 4 (line 450). In serial tests these two strings must stay in sync, but there is no compile-time guarantee.

Extract it to a module-level constant so a single change keeps both steps aligned:

// Near PROFILE_NAME, BASH_TOKEN, etc. at the top of the file
const USER_MESSAGE = "Please run a quick terminal command and then reply.";

Then reference USER_MESSAGE in both step 3 and step 4.

3. Brittle fallback strings in the cron schedule assertion

text.toLowerCase().includes("9:00") and text.toLowerCase().includes("every day") scan the entire document.body.innerText. Any clock widget, table cell, or debug output on the page containing "9:00" or "every day" would cause a false-positive pass. The first branch (text.includes(schedule) where schedule = CRON_SCHEDULE) already covers the raw-cron fallback; the second and third branches add fragility without much additional signal.

Prefer a scoped Playwright assertion instead:

// Use getByText if CRON_SCHEDULE is always rendered on the page
await expect(page.getByText(CRON_SCHEDULE)).toBeVisible({ timeout: 10_000 });

If the UI always renders a human-readable form and never the raw cron, consider querying the API for schedule_human and asserting against that value — either way, tie the assertion to the actual rendered element rather than a full-page text scan.

Nits / Non-blocking

The page.on("request", ...) listener is never removed via page.off(). In practice this is fine because Playwright disposes the page after each test, but for completeness a named handler variable with a corresponding page.off() call would be cleaner.
conversationIds.add(step3ConversationId!) in step 4 uses a non-null assertion after the expect(...).toBeTruthy() guard — that pattern is correct and readable.
The afterAll safety-net cleanup is a good practice; no issues there.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

1. Tighten URL filter: use exact pathname match (new URL(...).pathname === '/api/conversations') instead of broad .includes() to avoid capturing sub-path POSTs like /api/conversations/{id}/messages. 2. Extract hardcoded user message to module-level USER_MESSAGE constant so step 3 and step 4 stay in sync automatically. 3. Replace overly broad cron schedule assertion (full-page text scan with fallback strings) with a scoped page.getByText(CRON_SCHEDULE) check. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot · 2026-05-29T17:13:30Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

github-actions · 2026-05-29T17:14:12Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 447642f7 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	273ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	24.6s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	5.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

all-hands-bot

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

Prior Review Resolution

All three issues raised in the previous review have been fully addressed in this head commit:

✅ URL filter tightened to an exact pathname === "/api/conversations" match — no longer captures sub-paths like /api/conversations/{id}/messages.
✅ User message string extracted to the module-level USER_MESSAGE constant (line 44), eliminating the silent drift risk between step 3 and step 4.
✅ Cron schedule assertion replaced with page.getByText(CRON_SCHEDULE).toBeVisible() — scoped element query instead of a full-page text scan.

The new checks integrate cleanly into the serial test flow and close the gaps identified in #511.

Remaining Issues

1. Misleading comment and redundant `add()` call at step 4 setup (`mock-llm-conversation.spec.ts:407–408`)

// Re-create the conversation ID tracking so afterEach can clean up
conversationIds.add(step3ConversationId!);

The comment is wrong on two counts:

The add() call is a no-op. step3ConversationId was already added to conversationIds in step 3 (line 315) and was intentionally preserved by afterEach’s if (id === step3ConversationId) continue guard. Adding it again to a Set that already contains it has no effect.
afterEach does not clean up step3ConversationId — it skips it. The afterAll safety net is what actually calls deleteConversation. The comment therefore misrepresents which hook performs the cleanup.

Suggested fix — remove both the line and the comment (the conversation is already tracked and afterAll handles cleanup), or replace the comment with an accurate description:

// step3ConversationId is already in conversationIds (added in step 3).
// afterEach skips it; afterAll handles the final cleanup.

2. `page.on("request", …)` listener never removed (`mock-llm-conversation.spec.ts:261`) — nit

The listener registered with page.on("request", ...) is not removed after the assertions complete. The prior review flagged this as non-blocking — Playwright disposes the page after each test so there is no functional risk. A named handler with a corresponding page.off() call would be cleaner but is not required before merge.

Summary

A well-structured PR that systematically closes the coverage gaps from #511. The serial test orchestration — shared step3ConversationId, skip-in-afterEach, afterAll cleanup — is logically correct, and the inline comments explaining design decisions (particularly the LIFO note for page.on vs page.route) are exactly the kind of context future maintainers need.

The misleading comment at step 4 (issue 1 above) should be corrected before merge; everything else is in good shape.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

- Remove no-op conversationIds.add(step3ConversationId) in step 4 — the ID is already tracked from step 3 and afterAll handles its cleanup. - Extract page.on('request') handler to a named function and call page.off() after the conversation URL is captured. Co-authored-by: openhands <openhands@all-hands.dev>

stale as per latest commit

all-hands-bot · 2026-05-29T17:29:16Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

github-actions · 2026-05-29T17:30:09Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 97812ca3 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	274ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.6s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

all-hands-bot

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

Summary

All three issues and both nits raised in the previous review have been addressed cleanly:

URL filter now uses exact pathname === "/api/conversations" ✅
USER_MESSAGE extracted to module-level constant ✅
Cron schedule assertion uses scoped page.getByText(CRON_SCHEDULE) ✅
Named captureConversationPayload handler with page.off() ✅
Redundant conversationIds.add(step3ConversationId) removed ✅

The step 2 settings-API verification is thorough (checks both model and base_url), the afterAll cleanup pattern is correct Playwright practice, and the LIFO-conflict explanation in the comment is exactly the right context for future maintainers. Two minor observations below.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

…ge check - Replace expect().toBeTruthy() with test.skip() so step 4 is skipped (not failed) when step 3 didn't complete. - Replace expect.poll + page.evaluate with Playwright-native locator().filter({ hasText }).toBeVisible() for the user message assertion in step 4. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot · 2026-05-29T18:21:08Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

github-actions · 2026-05-29T18:21:09Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: d0d4ac4c · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	234ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	5.7s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	7.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

all-hands-bot

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

Summary

The PR is in good shape. All seven threads from previous rounds are resolved, and the current diff is clean:

Settings API verification (step 2): checks both model and base_url — thorough. ✅
Payload capture (captureConversationPayload): exact-pathname filter, named handler, page.off() after navigation — correct. ✅
worktree: true assertion: properly gates on the captured payload being non-null before asserting the field. ✅
User-message and sidebar checks: native Playwright locators (filter({ hasText }), getByTestId). ✅
Resume test (step 4): test.skip(!step3ConversationId) for graceful skip, then verifies reply token, user message, and no error banners. ✅
Cleanup lifecycle: afterEach skip guard + afterAll safety-net covers all exit paths. ✅
Automation spec: getByTestId("active-status-badge-active") and getByText(CRON_SCHEDULE) are well-scoped. ✅

One minor inline comment below (error message wording only — non-blocking).

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-29T18:28:25Z

⚠️ Mock-LLM E2E Tests

0/0 passed

Commit: 3660d150 · Workflow run

Status	Test	Duration

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

tofarr

🍰

github-actions · 2026-05-29T18:31:01Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: a6a2c5c1 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	216ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	24.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

_{Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)}

github-actions · 2026-05-29T18:39:57Z

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

Warning

One or more snapshot tests crashed during generation — some snapshots below may be incomplete.
Check the CI logs for the full error output (look for the "Generate current PR snapshots" step).

❌ 8 snapshots differ from the main branch baselines. Add the update-snapshots label to acknowledge intentional changes.

Category	Count
🔴 Changed	8
🆕 New	0
✅ Unchanged	65
Total	73

How to resolve:

Unintentional diffs — the baselines on main may have moved since this branch was created. Merge the latest main into this branch and re-run CI.

Intentional changes — add the update-snapshots label. CI will pass and the new screenshots become the baseline when this PR merges.

🔴 Changed snapshots (8)

`backends-extended` — 2 snapshots

backend-after-switch

Expected (main)	Actual (PR)	Diff

backend-manage-after-removal

Expected (main)	Actual (PR)	Diff

`mcp-page` — 5 snapshots

mcp-custom-server-1-editor-open

Expected (main)	Actual (PR)	Diff

mcp-custom-server-editor

Expected (main)	Actual (PR)	Diff

mcp-empty-installed

Expected (main)	Actual (PR)	Diff

mcp-search-filtered

Expected (main)	Actual (PR)	Diff

mcp-slack-install-1-marketplace

Expected (main)	Actual (PR)	Diff

`settings-page`

settings-page

Expected (main)	Actual (PR)	Diff

✅ Unchanged snapshots (65)

archived-conversation

conversation-panel-with-archived-badges
conversation-view-archived
conversation-view-sandbox-error

automations

automations-delete-modal
automations-list-active-inactive
automations-no-automations
automations-search-no-results

backends-extended

backend-add-blank-disabled
backend-add-cloud-advanced-open
backend-add-cloud-no-key-disabled
backend-add-cloud-with-key-enabled
backend-add-form-partially-filled
backend-add-invalid-url-disabled
backend-add-local-ready
backend-add-name-only-disabled
backend-add-two-column-layout
backend-add-whitespace-host-disabled
backend-cancel-nothing-saved
backend-dropdown-two-backends
backend-edit-prefilled
backend-manage-two-listed
backend-remove-cancelled
backend-remove-confirmation
backend-switch-overlay

backends

backend-add-modal
backend-manage-modal
backend-selector-open

changes-tab

changes-deleted-file
changes-diff-viewer
changes-empty

collapsible-thinking

reasoning-content-collapsed
reasoning-content-expanded
think-action-collapsed
think-action-expanded

mcp-page

mcp-custom-server-2-url-filled
mcp-custom-server-3-all-filled
mcp-custom-server-4-installed
mcp-slack-install-2-modal
mcp-slack-install-3-filled
mcp-slack-install-4-installed

onboarding

onboarding-step-0-choose-agent
onboarding-step-1-check-backend
onboarding-step-2-setup-llm
onboarding-step-3-say-hello

projects-workspace-browser

projects-workspace-browser

settings-page

add-backend-modal
analytics-consent-modal
home-screen
settings-app-page

settings-secrets

secrets-add-form-filled
secrets-add-form
secrets-after-save
secrets-delete-confirm
secrets-list

settings-verification

condenser-settings
verification-settings-off
verification-settings-on

sidebar

sidebar-collapsed
sidebar-conversation-panel
sidebar-filter-menu

skills-page

skills-empty
skills-loaded
skills-no-match
skills-search-filtered
skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

vercel Bot deployed to Preview May 29, 2026 16:58 View deployment

malhotra5 added the e2e-tests Triggers mock-LLM E2E tests on PRs label May 29, 2026

Merge branch 'main' into test/mock-llm-coverage-511

9e54277

malhotra5 marked this pull request as ready for review May 29, 2026 17:03

malhotra5 requested a review from all-hands-bot May 29, 2026 17:03

github-actions Bot added a commit that referenced this pull request May 29, 2026

snapshot images for PR #940 run 26650711277

176db3c

vercel Bot deployed to Preview May 29, 2026 17:03 View deployment

all-hands-bot reviewed May 29, 2026

View reviewed changes

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

Comment thread tests/e2e/mock-llm/mock-llm-automation.spec.ts Outdated

github-actions Bot added a commit that referenced this pull request May 29, 2026

snapshot images for PR #940 run 26650873478

379066b

vercel Bot deployed to Preview May 29, 2026 17:12 View deployment

malhotra5 requested a review from all-hands-bot May 29, 2026 17:12

all-hands-bot previously requested changes May 29, 2026

View reviewed changes

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

github-actions Bot added a commit that referenced this pull request May 29, 2026

snapshot images for PR #940 run 26651268287

964a3f6

malhotra5 requested a review from all-hands-bot May 29, 2026 17:28

vercel Bot deployed to Preview May 29, 2026 17:28 View deployment

all-hands-bot reviewed May 29, 2026

View reviewed changes

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

github-actions Bot added a commit that referenced this pull request May 29, 2026

snapshot images for PR #940 run 26652019839

856cdbb

vercel Bot deployed to Preview May 29, 2026 18:19 View deployment

malhotra5 requested a review from all-hands-bot May 29, 2026 18:19

all-hands-bot reviewed May 29, 2026

View reviewed changes

Comment thread tests/e2e/mock-llm/mock-llm-conversation.spec.ts Outdated

fix: correct stale error message to reference page.on listener

3660d15

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions Bot added a commit that referenced this pull request May 29, 2026

snapshot images for PR #940 run 26654433899

8c17586

Merge branch 'main' into test/mock-llm-coverage-511

a6a2c5c

vercel Bot deployed to Preview May 29, 2026 18:29 View deployment

tofarr approved these changes May 29, 2026

View reviewed changes

malhotra5 enabled auto-merge (squash) May 29, 2026 18:29

github-actions Bot added a commit that referenced this pull request May 29, 2026

snapshot images for PR #940 run 26654908588

9ec8c0e

malhotra5 merged commit 34b0da8 into main May 29, 2026
21 of 23 checks passed

malhotra5 deleted the test/mock-llm-coverage-511 branch May 29, 2026 18:49

Conversation

malhotra5 commented May 29, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Summary

Issue Number

How to Test

Type

Uh oh!

vercel Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 29, 2026

✅ Mock-LLM E2E Tests

Uh oh!

all-hands-bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 29, 2026

✅ Mock-LLM E2E Tests

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Code Review — test(mock-llm): strengthen E2E coverage for conversation and automation flows

Summary

Issues

1. Broad URL filter may capture unintended POST requests

2. Hardcoded user message string in step 4

3. Brittle fallback strings in the cron schedule assertion

Nits / Non-blocking

Uh oh!

Uh oh!

Uh oh!

Uh oh!

all-hands-bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 29, 2026

✅ Mock-LLM E2E Tests

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Code Review — test(mock-llm): strengthen E2E coverage for conversation and automation flows

Prior Review Resolution

Remaining Issues

1. Misleading comment and redundant add() call at step 4 setup (mock-llm-conversation.spec.ts:407–408)

2. page.on("request", …) listener never removed (mock-llm-conversation.spec.ts:261) — nit

Summary

Uh oh!

Uh oh!

Uh oh!

all-hands-bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 29, 2026

✅ Mock-LLM E2E Tests

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Code Review — test(mock-llm): strengthen E2E coverage for conversation and automation flows

Summary

Uh oh!

Uh oh!

Uh oh!

all-hands-bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 29, 2026

✅ Mock-LLM E2E Tests

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Code Review — test(mock-llm): strengthen E2E coverage for conversation and automation flows

Summary

Uh oh!

Uh oh!

github-actions Bot commented May 29, 2026

⚠️ Mock-LLM E2E Tests

Uh oh!

tofarr left a comment

malhotra5 commented May 29, 2026 •

edited by github-actions Bot

Loading

vercel Bot commented May 29, 2026 •

edited

Loading

all-hands-bot commented May 29, 2026 •

edited

Loading

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

all-hands-bot commented May 29, 2026 •

edited

Loading

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

1. Misleading comment and redundant `add()` call at step 4 setup (`mock-llm-conversation.spec.ts:407–408`)

2. `page.on("request", …)` listener never removed (`mock-llm-conversation.spec.ts:261`) — nit

all-hands-bot commented May 29, 2026 •

edited

Loading

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

all-hands-bot commented May 29, 2026 •

edited

Loading

Code Review — `test(mock-llm): strengthen E2E coverage for conversation and automation flows`

`backends-extended` — 2 snapshots

`mcp-page` — 5 snapshots

`settings-page`