test(browser): regression tests for issue #723 error clarity #1686
Open
sena-labs wants to merge 1 commit into
Guard against reintroducing the vague 'Task reached step limit without completion. Last page: about:blank' message that the removed browser-use agent produced when the underlying LLM provider rejected structured-output headers (e.g. 'anthropic-beta: structured-outputs-2025-11-13').

The current Playwright-based browser tool returns specific, actionable messages for each failure mode:

- Runtime unavailable → 'Browser runtime unavailable: <reason>'
- Action call failure → 'Browser <action> failed: <reason>'
- Unknown action → 'Unknown browser action: <name>'

Three new tests verify these contracts and assert the old confusing wording is absent.

Closes agent0ai#723

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Adds three regression tests to `tests/test_browser_agent_regressions.py` that guard against reintroducing the confusing `"Task reached step limit without completion. Last page: about:blank"` error reported in Problem with the browsing tool #723.
- Each test patches `get_runtime` to simulate a different failure mode of the Playwright-based browser tool and asserts that the returned message is specific, actionable, and does not contain the old `"step limit"` / `"about:blank"` wording.

Background
Issue #723 reported that every browser task since v0.9.4 silently failed with:

`Task reached step limit without completion. Last page: about:blank`

Root cause: the removed `browser-use` library sent an `anthropic-beta: structured-outputs-2025-11-13` header to OpenRouter, which proxied requests to providers (Amazon Bedrock, Google Vertex) that reject it. After three consecutive API errors, `browser-use` exhausted its step budget and returned the generic fallback message above, giving users no indication that the real problem was an API rejection.

The fix (commit `983d431a`) replaced the browser-use agent with the native Playwright `_browser` plugin, which propagates concrete error messages:

- `Browser runtime unavailable: <reason>`
- `Browser <action> failed: <reason>`
- `Unknown browser action: <name>`

The existing `test_legacy_browser_dependency_is_removed` test already verifies that `browser-use` and `_browser_agent` are gone. This PR adds complementary tests that verify the new tool's error-handling contracts, so the old vague wording cannot be reintroduced undetected.

Tests added
- `test_browser_tool_returns_clear_error_when_runtime_unavailable`: `get_runtime` raises → message contains `"runtime unavailable"`, not `"step limit"`
- `test_browser_action_failure_names_the_failed_action`: `runtime.call` raises → message names the failing action and contains `"failed"`
- `test_browser_unknown_action_returns_informative_message`

Test plan
All four tests pass. No other tests are modified.
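For readers without the diff open, the contracts described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `run_browser_action`, its `known_actions` parameter, and the order of its checks are hypothetical stand-ins for the real `_browser` tool, and the test names are shortened. Only the three message formats and the forbidden `"step limit"` / `"about:blank"` wording come from the description.

```python
from unittest.mock import MagicMock

# Wording from the old browser-use fallback that must never reappear.
OLD_WORDING = ("step limit", "about:blank")


def run_browser_action(get_runtime, action, known_actions=("open", "click")):
    """Hypothetical stand-in for the browser tool's error handling."""
    try:
        runtime = get_runtime()
    except Exception as exc:
        return f"Browser runtime unavailable: {exc}"
    if action not in known_actions:
        return f"Unknown browser action: {action}"
    try:
        return runtime.call(action)
    except Exception as exc:
        return f"Browser {action} failed: {exc}"


def assert_no_old_wording(message):
    lowered = message.lower()
    for phrase in OLD_WORDING:
        assert phrase not in lowered


def test_runtime_unavailable():
    # get_runtime itself raises, as when the runtime cannot start.
    get_runtime = MagicMock(side_effect=RuntimeError("playwright not installed"))
    message = run_browser_action(get_runtime, "open")
    assert "runtime unavailable" in message.lower()
    assert_no_old_wording(message)


def test_action_failure():
    # The runtime is available, but the action call raises.
    runtime = MagicMock()
    runtime.call.side_effect = TimeoutError("navigation timed out")
    message = run_browser_action(MagicMock(return_value=runtime), "click")
    assert "click" in message and "failed" in message
    assert_no_old_wording(message)


def test_unknown_action():
    message = run_browser_action(MagicMock(return_value=MagicMock()), "teleport")
    assert message == "Unknown browser action: teleport"
    assert_no_old_wording(message)
```

Asserting the absence of the old wording in every test, as well as the presence of the new wording, is what makes these regression guards and not just format checks.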
Closes #723
🤖 Generated with Claude Code