Web: tapOn into Flutter Web semantics — finder + hit-target fixes#74
@richjun Thanks for digging into this — the three-layer analysis (finder rejection + pre-flight concession + post-click concession) is exactly right, and confirming with Occluded/Transformed tests that the Flutter concession doesn't relax legitimate rejection Quick heads up: while this PR has been open, I refactored the 5-iteration poll loop out of To unblock, could you rebase against latest
Your One optional cleanup while you're in there: |
|
@omnarayan main HEAD is |
|
@richjun Apologies, Thanks for flagging. |
Rod's page.Search() uses CDP DOM.performSearch, which matches against the
serialized HTML of every node, including the source text inside <script>
and <style> blocks. On a page whose JS source happens to contain a button
label as a literal string (e.g. "Close" in `await page.click('Close')`),
the cascade in findByText silently returned the <script> element itself
instead of falling through to the JS findByText path that walks shadow
roots and same-origin iframes.

The wrong-element pick had two visible failure modes:

- Click coordinates resolve to the <script>'s zero-size box (or page
  origin), then the hit-target verifier reports occlusion by whatever
  real content sits there ("blocked by overlay (<flutter-view> …)" was
  the symptom on the createboard test). Non-deterministic, depending on
  the timing of which node the AX-tree cascade resolves first.
- tapOn returned success silently when no hit-target check was active,
  landing the click on nonsense coordinates and reporting fake green.

Extends the existing IFRAME/FRAME rejection in findBySearch to also drop
SCRIPT, STYLE, TEMPLATE, NOSCRIPT, TITLE, META, LINK, HEAD: any tag
whose text content is source/inert and never represents a tappable user
affordance. The cascade then falls through to JS findByText, which walks
_collectRoots() and returns the actual visible match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
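
For readers skimming the thread, the tag filter described in this commit could look roughly like the sketch below. Only the tag names come from the commit message; `nonTappableTags` and `isNonTappableTag` are hypothetical names, and the real check inside findBySearch may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// nonTappableTags holds tags whose text is source code or inert metadata.
// The tag list is taken from the commit message; the variable name and
// the set-based layout are assumptions made for this sketch.
var nonTappableTags = map[string]struct{}{
	"IFRAME": {}, "FRAME": {},
	"SCRIPT": {}, "STYLE": {}, "TEMPLATE": {}, "NOSCRIPT": {},
	"TITLE": {}, "META": {}, "LINK": {}, "HEAD": {},
}

// isNonTappableTag reports whether a search hit with this tag name should
// be dropped so the finder cascade can fall through to the next strategy.
// The comparison is case-insensitive and ignores surrounding whitespace.
func isNonTappableTag(tag string) bool {
	_, ok := nonTappableTags[strings.ToUpper(strings.TrimSpace(tag))]
	return ok
}

func main() {
	for _, tag := range []string{"script", "BUTTON", "flt-semantics"} {
		fmt.Printf("%s rejected=%v\n", tag, isNonTappableTag(tag))
	}
}
```

A set lookup keeps the rejection list easy to extend if further inert tags turn up in practice.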
The Playwright-port hit-target check in expectHitTarget walks roots from
the target element outward and asserts that elementsFromPoint at the
click coordinate returns an ancestor of the target. That assumption
holds for normal HTML and for shadow-DOM widgets, but it is structurally
wrong for Flutter Web semantics:

- <flutter-view> (Flutter 3.x) acts as a glass pane that intercepts
  every pointer event and routes it to the appropriate <flt-semantics>
  via Flutter's own internal hit testing.
- The accessibility tree may live in light DOM under flutter-view, in
  its shadow root, or as a sibling <flt-semantics-host> with the
  rendering canvas (flt-scene-host) stacked above. In all of these
  layouts elementsFromPoint at a semantics target returns flutter-view
  (or its canvas), never the semantics node.

So a strict same-element walk-up always reports false occlusion and
refuses to dispatch, observable as intermittent
"Click on text=\"Close\" blocked by overlay (<flutter-view> …)" failures
on real Flutter Web pages whenever timing made the canvas the topmost
hit (vs occasionally finding the semantics directly).

Adds a concession at the end of expectHitTarget: when both target and
the topmost hit element live inside the same Flutter app's iframe (any
ancestor with `flutter-view` or an `flt-*` tag, walked across shadow
boundaries), accept the dispatch. Flutter routes the trusted click to
the right semantics action. Non-Flutter occlusion (overlay div, modal,
genuine z-stack) continues to reject as before: the Occluded /
Transformed regression tests still fail-fast.

The 5 existing TestTapOnCrossRoot_* iframe tests continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
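
The concession rule in this commit can be modeled in a few lines. This is only a simplified sketch: the real check runs in the page's JavaScript against live DOM nodes, whereas the `node` type, `inFlutter`, `docRoot`, and `flutterConcession` below are hypothetical stand-ins. A single `parent` pointer is assumed to cross shadow boundaries (it points at the shadow host where needed), and "same Flutter app's iframe" is approximated as "same document root".

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, minimal stand-in for a DOM element. parent is the
// parent element, or the shadow host when the element sits directly under
// a shadow root, so one upward walk crosses shadow boundaries.
type node struct {
	tag    string
	parent *node
}

// isFlutterTag matches flutter-view and any flt-* custom element.
func isFlutterTag(tag string) bool {
	t := strings.ToLower(tag)
	return t == "flutter-view" || strings.HasPrefix(t, "flt-")
}

// inFlutter reports whether n itself or any of its ancestors is a Flutter element.
func inFlutter(n *node) bool {
	for ; n != nil; n = n.parent {
		if isFlutterTag(n.tag) {
			return true
		}
	}
	return false
}

// docRoot returns the topmost ancestor, used here as the document identity.
func docRoot(n *node) *node {
	for n != nil && n.parent != nil {
		n = n.parent
	}
	return n
}

// flutterConcession accepts the dispatch only when both the target and the
// topmost hit element are inside Flutter and share one document.
func flutterConcession(target, hit *node) bool {
	return inFlutter(target) && inFlutter(hit) && docRoot(target) == docRoot(hit)
}

func main() {
	html := &node{tag: "html"}
	body := &node{tag: "body", parent: html}
	view := &node{tag: "flutter-view", parent: body}
	semantics := &node{tag: "flt-semantics", parent: view}
	canvas := &node{tag: "canvas", parent: &node{tag: "flt-scene-host", parent: view}}
	overlay := &node{tag: "div", parent: body}

	fmt.Println("canvas over semantics:", flutterConcession(semantics, canvas))
	fmt.Println("overlay over semantics:", flutterConcession(semantics, overlay))
}
```

Requiring both elements to be inside Flutter is what keeps a plain overlay `div` rejected even when the target is a semantics node.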
…nsumption

Companion to 1be423d (pre-flight Flutter concession). The pre-flight
expectHitTarget walk was made permissive for Flutter Web semantics:
<flutter-view> / <flt-glass-pane> sit on top of <flt-semantics> at the
DOM layer and elementsFromPoint always returns the canvas, so a strict
walk-up was a false-occlusion source. That fix lets dispatch proceed.

The post-click verifier was still strict, and that asymmetry is the bug.
setupHitTargetInterceptor installs a one-shot
addEventListener('pointerdown'|'mousedown', listener, true) on the
target frame's window and pollHitTargetResult retries 5 × 20ms for the
trusted event to fire so it can re-run expectHitTarget against the
actual fire-time clientX/Y. For Flutter targets this listener never
fires: Flutter's pointer router intercepts trusted pointer events at the
document/flutter-view capture layer, routes them into its own internal
hit testing for semantics dispatch, and does not re-emit them at the
window level for third-party listeners. So every cross-iframe tap on a
Flutter semantics node (Welcome dialog Close, More Tools menu items,
color picker swatches) failed with
"Click on text=\"Close\" dispatched but verification timed out", even
though Chromium delivered the trusted click to the right coordinates and
Flutter handled it.

Mirror the pre-flight Flutter concession in the verifier:

- jshelper.js: setupHitTargetInterceptor records inFlutter on the state
  object. pollHitTargetResult is also unified to always return a single
  object shape (was bare 'pending' / 'done' strings mixed with
  { hitTargetDescription } objects, four shapes the Go side had to
  branch on). Now exactly three: { status: 'done' },
  { status: 'pending', inFlutter },
  { status: 'failed', hitTargetDescription }.
- commands.go: the post-click Flutter concession lives in the shared
  dispatchCrossRoot helper (introduced in 9f17508). When poll budget
  exhausts and inFlutter is true, accept the dispatch: pre-flight
  already validated the static hit point and Flutter's own hit testing
  handled the trusted click. Living in dispatchCrossRoot means
  doubleTapOn / longPressOn / scrollUntilVisible inherit the concession
  for free, since they share this path. Non-Flutter timeouts continue to
  fail-fast as before, so the "click never landed" detection on real DOM
  stays intact.

Verified end-to-end on createboard.lgbusinesscloud.com Board Editor:
flow now passes through Welcome dialog dismiss → Pen panel → Red color →
Close panel → More Tools → Help → Gesture page → Select Object. Pre-fix:
failed at the first Flutter semantics tap with verifier timeout.

The five existing TestTapOnCrossRoot_* tests (TopFrame / Iframe /
IframeShadow / Occluded / Transformed) continue to pass: the Flutter
concession does not relax non-Flutter occlusion rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
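
The Go-side decision over the three result shapes might be sketched as below. The JSON field names follow the shapes listed in the commit message; `pollResult`, `verifyDispatch`, and the injected `poll` callback are hypothetical, and the actual dispatchCrossRoot helper may differ in structure and error wording.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// pollResult mirrors the three shapes named in the commit message:
// {status:'done'}, {status:'pending', inFlutter}, {status:'failed', hitTargetDescription}.
type pollResult struct {
	Status               string `json:"status"`
	InFlutter            bool   `json:"inFlutter"`
	HitTargetDescription string `json:"hitTargetDescription"`
}

// verifyDispatch polls up to attempts times. poll is a hypothetical callback
// returning the JSON produced by the page-side helper.
func verifyDispatch(poll func() (string, error), attempts int, interval time.Duration) error {
	var last pollResult
	for i := 0; i < attempts; i++ {
		raw, err := poll()
		if err != nil {
			return err
		}
		last = pollResult{} // reset so fields from an earlier poll cannot leak
		if err := json.Unmarshal([]byte(raw), &last); err != nil {
			return fmt.Errorf("decode poll result: %w", err)
		}
		switch last.Status {
		case "done":
			return nil
		case "failed":
			return fmt.Errorf("blocked by overlay (%s)", last.HitTargetDescription)
		case "pending":
			time.Sleep(interval)
		default:
			return fmt.Errorf("unexpected poll status %q", last.Status)
		}
	}
	// Budget exhausted: concede only when the target sits inside Flutter.
	if last.InFlutter {
		return nil
	}
	return errors.New("dispatched but verification timed out")
}

func main() {
	flutterPending := func() (string, error) {
		return `{"status":"pending","inFlutter":true}`, nil
	}
	domPending := func() (string, error) {
		return `{"status":"pending","inFlutter":false}`, nil
	}
	fmt.Println("flutter:", verifyDispatch(flutterPending, 5, 20*time.Millisecond))
	fmt.Println("plain DOM:", verifyDispatch(domPending, 5, 20*time.Millisecond))
}
```

Keeping the concession at the point where the budget runs out means a `failed` result still rejects immediately, including for Flutter targets.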
Force-pushed: d0e35ec to 0a3f4c3
|
@omnarayan Rebased on All 5 |
|
Thanks @richjun — clean rebase, both the move into |
Summary
Three related fixes that together let `tapOn: <text>` work reliably on Flutter Web pages whose accessibility tree (`<flt-semantics>`) is rendered inside an iframe.

All three fixes are surgical and cover orthogonal layers; any one alone is insufficient.

1. `findBySearch` rejects non-tappable text containers (ca2ae96)

Rod's `page.Search()` uses CDP `DOM.performSearch`, which matches against the serialized HTML of every node, including the source text inside `<script>` and `<style>` blocks. On a page whose JS source happens to contain a button label as a literal string (e.g. `"Close"` in `await page.click('Close')`), the cascade in `findByText` silently returned the `<script>` element itself instead of falling through to the JS `findByText` path that walks shadow roots and same-origin iframes.

The wrong-element pick had two visible failure modes:

- Click coordinates resolve to the `<script>`'s zero-size box, then the hit-target verifier reports occlusion by whatever real content sits at the page origin (`Click on text="…" blocked by overlay (<flutter-view> …)`). Non-deterministic depending on AX-tree timing.
- `tapOn` returned success silently when no hit-target check was active, landing the click on nonsense coordinates and reporting fake green.

Extends the existing `IFRAME`/`FRAME` rejection to also drop `SCRIPT`, `STYLE`, `TEMPLATE`, `NOSCRIPT`, `TITLE`, `META`, `LINK`, `HEAD`. The cascade then falls through to JS `findByText`, which walks `_collectRoots()` and returns the actual visible match.

2. Hit-target pre-flight accepts Flutter Web glass-pane occlusion (1be423d)

The Playwright-port hit-target check in `expectHitTarget` walks roots from the target outward and asserts that `elementsFromPoint` at the click coordinate returns an ancestor of the target. That assumption holds for normal HTML and for shadow-DOM widgets, but is structurally wrong for Flutter Web semantics:

- `<flutter-view>` (Flutter 3.x) acts as a glass pane that intercepts every pointer event and routes it to the appropriate `<flt-semantics>` via Flutter's own internal hit testing.
- The accessibility tree may live in light DOM under flutter-view, in its shadow root, or as a sibling `<flt-semantics-host>` with the rendering canvas (`flt-scene-host`) stacked above. In all of these layouts `elementsFromPoint` at a semantics target returns flutter-view (or its canvas), never the semantics node.

Result: a strict same-element walk-up always reports false occlusion and refuses to dispatch, observable as intermittent `Click on text="…" blocked by overlay (<flutter-view> …)` failures whenever timing made the canvas the topmost hit (vs occasionally finding the semantics directly).

Adds a concession at the end of `expectHitTarget`: when both target and the topmost hit element live inside the same Flutter app's iframe (any ancestor with `flutter-view` or an `flt-*` tag, walked across shadow boundaries), accept the dispatch. Flutter routes the trusted click to the right semantics action. Non-Flutter occlusion (overlay div, modal, genuine z-stack) continues to reject as before: the Occluded / Transformed regression tests still fail-fast.

3. Hit-target post-click verifier accepts Flutter Web glass-pane consumption (d0e35ec)

Companion to (2). Pre-flight `expectHitTarget` was made permissive for the Flutter glass pane so dispatch could proceed, but the post-click verifier was still strict, and that asymmetry is the actual production bug.

`setupHitTargetInterceptor` installs a one-shot `addEventListener('pointerdown'|'mousedown', listener, true)` on the target frame's window and `pollHitTargetResult` retries 5 × 20ms for the trusted event to fire so it can re-run `expectHitTarget` against the actual fire-time `clientX/Y`. For Flutter targets this listener never fires: Flutter's pointer router intercepts trusted pointer events at the document/flutter-view capture layer, routes them into its own internal hit testing for semantics dispatch, and does not re-emit them at the window level for third-party listeners. So every cross-iframe tap on a Flutter semantics node failed with `Click on text="…" dispatched but verification timed out`, even though Chromium delivered the trusted click to the right coordinates and Flutter handled it.

Mirror the pre-flight Flutter concession in the verifier:

- `jshelper.js`: `setupHitTargetInterceptor` records `inFlutter` on the state object; `pollHitTargetResult` returns `{ status: 'pending', inFlutter }` while the listener hasn't fired (was bare `'pending'` string).
- `commands.go`: when poll budget exhausts and `inFlutter` is true, accept the dispatch: pre-flight already validated the static hit point and Flutter's own hit testing handled the trusted click. Non-Flutter timeouts continue to fail-fast as before, so the "click never landed" detection on real DOM stays intact.

Test plan

- `go test ./pkg/driver/browser/cdp/ -run TestTapOnCrossRoot_*`: all 5 iframe tests pass (TopFrame / Iframe / IframeShadow / Occluded / Transformed)
- … `<flt-semantics>` node (dialog dismiss, toolbar buttons, panel controls, menu items) now passes. Pre-fix: failed at the first Flutter semantics tap with verifier timeout.

🤖 Generated with Claude Code