Web: tapOn into Flutter Web semantics — finder + hit-target fixes #74

Merged
omnarayan merged 3 commits into devicelab-dev:main from richjun:fix/web-tapon-flutter-semantics on May 12, 2026

Conversation

@richjun (Contributor) commented May 9, 2026

Summary

Three related fixes that together let tapOn: <text> work reliably on Flutter Web pages whose accessibility tree (<flt-semantics>) is rendered inside an iframe.

All three fixes are surgical and cover orthogonal layers; any one alone is insufficient.

1. findBySearch rejects non-tappable text containers (ca2ae96)

Rod's page.Search() uses CDP DOM.performSearch which matches against the serialized HTML of every node — including the source text inside <script> and <style> blocks. On a page whose JS source happens to contain a button label as a literal string (e.g. "Close" in await page.click('Close')), the cascade in findByText silently returned the <script> element itself instead of falling through to the JS findByText path that walks shadow roots and same-origin iframes.

The wrong-element pick had two visible failure modes:

  • Click coordinates resolve to the <script>'s zero-size box, then the hit-target verifier reports occlusion by whatever real content sits at the page origin (Click on text="…" blocked by overlay (<flutter-view> …)). Non-deterministic depending on AX-tree timing.
  • tapOn returned success silently when no hit-target check was active, landing the click on nonsense coordinates and reporting fake green.

Extends the existing IFRAME/FRAME rejection to also drop SCRIPT, STYLE, TEMPLATE, NOSCRIPT, TITLE, META, LINK, HEAD. The cascade then falls through to JS findByText which walks _collectRoots() and returns the actual visible match.
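
A minimal Go sketch of the rejection rule described above, assuming the finder has the matched element's tag name available as a string. The names `inertTags` and `isInertTag` are hypothetical and are not taken from finder.go; only the tag list comes from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// inertTags holds the tags listed in this PR whose text is source or
// metadata, never a tappable affordance. The map name is hypothetical.
var inertTags = map[string]bool{
	"IFRAME": true, "FRAME": true, // the rejection that already existed
	"SCRIPT": true, "STYLE": true, "TEMPLATE": true, "NOSCRIPT": true,
	"TITLE": true, "META": true, "LINK": true, "HEAD": true,
}

// isInertTag reports whether a search hit on this tag should be dropped so
// the cascade can fall through to the next finder. Matching is
// case-insensitive and ignores surrounding whitespace.
func isInertTag(tag string) bool {
	return inertTags[strings.ToUpper(strings.TrimSpace(tag))]
}

func main() {
	for _, tag := range []string{"script", "BUTTON", "flt-semantics"} {
		fmt.Printf("%-14s rejected=%v\n", tag, isInertTag(tag))
	}
}
```

A set lookup keeps the rule easy to extend if further inert tags turn up; whether the real code uses a map or a switch is not stated here.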

2. Hit-target pre-flight accepts Flutter Web glass-pane occlusion (1be423d)

The Playwright-port hit-target check in expectHitTarget walks roots from the target outward and asserts that elementsFromPoint at the click coordinate returns an ancestor of the target. That assumption holds for normal HTML and for shadow-DOM widgets, but is structurally wrong for Flutter Web semantics:

  • <flutter-view> (Flutter 3.x) acts as a glass pane that intercepts every pointer event and routes it to the appropriate <flt-semantics> via Flutter's own internal hit testing.
  • The accessibility tree may live in light DOM under flutter-view, in its shadow root, or as a sibling <flt-semantics-host> with the rendering canvas (flt-scene-host) stacked above. In all of these layouts elementsFromPoint at a semantics target returns flutter-view (or its canvas), never the semantics node.

Result: a strict same-element walk-up always reports false occlusion and refuses to dispatch — observable as intermittent Click on text="…" blocked by overlay (<flutter-view> …) failures whenever timing made the canvas the topmost hit (vs occasionally finding the semantics directly).

Adds a concession at the end of expectHitTarget: when both target and the topmost hit element live inside the same Flutter app's iframe (any ancestor with flutter-view or an flt-* tag, walked across shadow boundaries), accept the dispatch. Flutter routes the trusted click to the right semantics action. Non-Flutter occlusion (overlay div, modal, genuine z-stack) continues to reject as before — the Occluded / Transformed regression tests still fail-fast.
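
The real check runs in the page as JavaScript inside expectHitTarget. The Go model below only illustrates the decision rule from the paragraph above under simplifying assumptions: `node` is a hypothetical stand-in for a DOM element, its `Parent` link is assumed to already bridge shadow boundaries, and "same Flutter app" is approximated as "same document root".

```go
package main

import (
	"fmt"
	"strings"
)

// node is a simplified, hypothetical stand-in for a DOM element. Parent is
// assumed to cross shadow boundaries (a shadow child points at its host).
type node struct {
	Tag    string
	Parent *node
}

// inFlutter reports whether n or any of its ancestors is a flutter-view or
// an flt-* element.
func inFlutter(n *node) bool {
	for cur := n; cur != nil; cur = cur.Parent {
		tag := strings.ToLower(cur.Tag)
		if tag == "flutter-view" || strings.HasPrefix(tag, "flt-") {
			return true
		}
	}
	return false
}

// documentRoot walks to the topmost ancestor, standing in for the frame's
// document in this model.
func documentRoot(n *node) *node {
	for n != nil && n.Parent != nil {
		n = n.Parent
	}
	return n
}

// acceptFlutterHit models the concession: accept only when the target and
// the topmost hit element share a document and both sit inside Flutter.
func acceptFlutterHit(target, topHit *node) bool {
	if target == nil || topHit == nil {
		return false
	}
	return documentRoot(target) == documentRoot(topHit) &&
		inFlutter(target) && inFlutter(topHit)
}

func main() {
	doc := &node{Tag: "html"}
	view := &node{Tag: "flutter-view", Parent: doc}
	semantics := &node{Tag: "flt-semantics", Parent: view}
	canvas := &node{Tag: "flt-scene-host", Parent: view}
	overlay := &node{Tag: "div", Parent: doc}

	fmt.Println("canvas over semantics accepted:", acceptFlutterHit(semantics, canvas))
	fmt.Println("overlay div over semantics accepted:", acceptFlutterHit(semantics, overlay))
}
```

Requiring both elements to be inside Flutter is what keeps a plain overlay div rejected even when the target itself is a semantics node.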

3. Hit-target post-click verifier accepts Flutter Web glass-pane consumption (d0e35ec)

Companion to (2). Pre-flight expectHitTarget was made permissive for Flutter glass-pane and dispatch could proceed — but the post-click verifier was still strict, and that asymmetry is the actual production bug.

setupHitTargetInterceptor installs a one-shot addEventListener('pointerdown'|'mousedown', listener, true) on the target frame's window and pollHitTargetResult retries 5 × 20ms for the trusted event to fire so it can re-run expectHitTarget against the actual fire-time clientX/Y. For Flutter targets this listener never fires: Flutter's pointer router intercepts trusted pointer events at the document/flutter-view capture layer, routes them into its own internal hit testing for semantics dispatch, and does not re-emit them at the window level for third-party listeners. So every cross-iframe tap on a Flutter semantics node failed with Click on text="…" dispatched but verification timed out, even though Chromium delivered the trusted click to the right coordinates and Flutter handled it.

Mirror the pre-flight Flutter concession in the verifier:

  • jshelper.js: setupHitTargetInterceptor records inFlutter on the state object; pollHitTargetResult returns { status: 'pending', inFlutter } while the listener hasn't fired (was bare 'pending' string).
  • commands.go: when poll budget exhausts and inFlutter is true, accept the dispatch — pre-flight already validated the static hit point and Flutter's own hit testing handled the trusted click. Non-Flutter timeouts continue to fail-fast as before, so the "click never landed" detection on real DOM stays intact.
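
The two bullets above can be sketched in Go roughly as follows. This is an illustration under assumptions, not the code in commands.go: `pollResult`, `verifyDispatch`, and the status constants are hypothetical names, and the poll function is injected so the sketch runs without a browser.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// status is a hypothetical enum for what one poll of the interceptor saw.
type status int

const (
	statusPending status = iota // listener has not fired yet
	statusDone                  // trusted event fired and hit the target
	statusBlocked               // trusted event fired but hit something else
)

// pollResult is a hypothetical Go-side view of one poll.
type pollResult struct {
	Status    status
	InFlutter bool   // recorded when the interceptor was installed
	Blocker   string // description of the occluding element, if any
}

var errNotVerified = errors.New("dispatched but verification timed out")

// verifyDispatch polls up to attempts times. If the budget runs out while
// still pending, it accepts the dispatch only for Flutter targets.
func verifyDispatch(poll func() pollResult, attempts int, interval time.Duration) error {
	var last pollResult
	for i := 0; i < attempts; i++ {
		last = poll()
		switch last.Status {
		case statusDone:
			return nil
		case statusBlocked:
			return fmt.Errorf("blocked by overlay (%s)", last.Blocker)
		}
		time.Sleep(interval)
	}
	if last.InFlutter {
		return nil // Flutter consumed the event; pre-flight already passed
	}
	return errNotVerified
}

func main() {
	flutterPending := func() pollResult { return pollResult{Status: statusPending, InFlutter: true} }
	domPending := func() pollResult { return pollResult{Status: statusPending} }
	fmt.Println("flutter:", verifyDispatch(flutterPending, 5, 20*time.Millisecond))
	fmt.Println("plain DOM:", verifyDispatch(domPending, 5, 20*time.Millisecond))
}
```

Note that a blocked result still fails even when InFlutter is true: the concession applies only to the case where the listener never fired.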

Test plan

  • go test ./pkg/driver/browser/cdp/ -run 'TestTapOnCrossRoot_': all 5 iframe tests pass (TopFrame / Iframe / IframeShadow / Occluded / Transformed)
  • Validated end-to-end on a real Flutter Web app whose semantics tree lives inside a same-origin iframe — every tap on a <flt-semantics> node (dialog dismiss, toolbar buttons, panel controls, menu items) now passes. Pre-fix: failed at the first Flutter semantics tap with verifier timeout.
  • Reviewer: confirm Occluded / Transformed regressions still reject correctly on real overlay scenarios outside Flutter contexts.

🤖 Generated with Claude Code

@omnarayan (Contributor) commented

@richjun Thanks for digging into this. The three-layer analysis (finder rejection + pre-flight concession + post-click concession) is exactly right, and confirming with the Occluded/Transformed tests that the Flutter concession doesn't relax legitimate rejection is the load-bearing safety check.

Quick heads up: while this PR has been open, I refactored the 5-iteration poll loop out of tapOnCrossRoot into a shared dispatchCrossRoot(elem, info, desc, verbed, dispatch) helper (commit 9f17508 on main), so the same path is now used by doubleTapOn and longPressOn. As a result, the patch no longer applies cleanly to main: git apply --check fails on pkg/driver/browser/cdp/commands.go.

To unblock, could you rebase against latest main and move the post-click Flutter concession into dispatchCrossRoot instead of tapOnCrossRoot? Concretely:

  • tapOnCrossRoot is now a ~4-line wrapper at commands.go:72 that just calls dispatchCrossRoot(...).
  • The 5-iteration poll loop you're patching lives in dispatchCrossRoot (commands.go:112). Your inFlutter flag + the post-loop if inFlutter { return successResult(...) } slot in there cleanly.
  • Once moved, the Flutter concession automatically benefits doubleTapOn and longPressOn for free — same Flutter pointer-router consumption issue, same fix.

Your jshelper.js (+43/-3) and finder.go (+20/-11) changes apply cleanly, no work needed there.

One optional cleanup while you're in there: pollHitTargetResult now returns four distinct value shapes ('done' | 'pending' string | { status: 'pending', inFlutter } object | { hitTargetDescription } object), which the Go side handles by branching on v.Has("status") then falling through to a v.Str() switch. Works, but mildly ugly. Up to you whether to unify to a single object-always shape during the rebase; happy to merge either way.

@richjun (Contributor, Author) commented May 11, 2026

@omnarayan main HEAD is a21bf5c here and I can't find 9f17508 / dispatchCrossRoot in the repo — did the refactor get pushed? Once it's reachable I'll rebase and move the concession into the helper.

@omnarayan (Contributor) commented

@richjun Apologies, 9f17508 was sitting on my local main and hadn't been pushed yet. Just pushed it (along with the visibility-check follow-up aee41cc that builds on it). After a git fetch origin, you should see dispatchCrossRoot in pkg/driver/browser/cdp/commands.go:112 and tapOnCrossRoot as the ~4-line wrapper at line 72.

Thanks for flagging.

richjun and others added 3 commits May 12, 2026 07:11
…...)

Rod's page.Search() uses CDP DOM.performSearch which matches against the
serialized HTML of every node — including the source text inside <script>
and <style> blocks. On a page whose JS source happens to contain a button
label as a literal string (e.g. "Close" in `await page.click('Close')`),
the cascade in findByText silently returned the <script> element itself
instead of falling through to the JS findByText path that walks shadow
roots and same-origin iframes.

The wrong-element pick had two visible failure modes:
  - Click coordinates resolve to the <script>'s zero-size box (or page
    origin), then the hit-target verifier reports occlusion by whatever
    real content sits there ("blocked by overlay (<flutter-view> …)" was
    the symptom on the createboard test). Non-deterministic depending on
    timing of which node the AX-tree cascade resolves first.
  - tapOn returned success silently when no hit-target check was active,
    landing the click on nonsense coordinates and reporting fake green.

Extends the existing IFRAME/FRAME rejection in findBySearch to also drop
SCRIPT, STYLE, TEMPLATE, NOSCRIPT, TITLE, META, LINK, HEAD — any tag
whose text content is source/inert and never represents a tappable user
affordance. The cascade then falls through to JS findByText which walks
_collectRoots() and returns the actual visible match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The Playwright-port hit-target check in expectHitTarget walks roots from
the target element outward and asserts that elementsFromPoint at the
click coordinate returns an ancestor of the target. That assumption
holds for normal HTML and for shadow-DOM widgets, but it is structurally
wrong for Flutter Web semantics:

  - <flutter-view> (Flutter 3.x) acts as a glass pane that intercepts
    every pointer event and routes it to the appropriate <flt-semantics>
    via Flutter's own internal hit testing.
  - The accessibility tree may live in light DOM under flutter-view, in
    its shadow root, or as a sibling <flt-semantics-host> with the
    rendering canvas (flt-scene-host) stacked above. In all of these
    layouts elementsFromPoint at a semantics target returns flutter-view
    (or its canvas), never the semantics node.

So a strict same-element walk-up always reports false occlusion and
refuses to dispatch — observable as intermittent
"Click on text=\"Close\" blocked by overlay (<flutter-view> …)" failures
on real Flutter Web pages whenever timing made the canvas the topmost
hit (vs occasionally finding the semantics directly).

Adds a concession at the end of expectHitTarget: when both target and
the topmost hit element live inside the same Flutter app's iframe (any
ancestor with `flutter-view` or an `flt-*` tag, walked across shadow
boundaries), accept the dispatch. Flutter routes the trusted click to
the right semantics action. Non-Flutter occlusion (overlay div, modal,
genuine z-stack) continues to reject as before — the Occluded /
Transformed regression tests still fail-fast.

The 5 existing TestTapOnCrossRoot_* iframe tests continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nsumption

Companion to 1be423d (pre-flight Flutter concession). The pre-flight
expectHitTarget walk was made permissive for Flutter Web semantics —
<flutter-view> / <flt-glass-pane> sit on top of <flt-semantics> at the
DOM layer and elementsFromPoint always returns the canvas, so a strict
walk-up was a false-occlusion source. That fix lets dispatch proceed.

The post-click verifier was still strict, and that asymmetry is the
bug. setupHitTargetInterceptor installs a one-shot
addEventListener('pointerdown'|'mousedown', listener, true) on the
target frame's window and pollHitTargetResult retries 5 × 20ms for the
trusted event to fire so it can re-run expectHitTarget against the
actual fire-time clientX/Y. For Flutter targets this listener never
fires: Flutter's pointer router intercepts trusted pointer events at
the document/flutter-view capture layer, routes them into its own
internal hit testing for semantics dispatch, and does not re-emit them
at the window level for third-party listeners. So every cross-iframe
tap on a Flutter semantics node — Welcome dialog Close, More Tools
menu items, color picker swatches — failed with
"Click on text=\"Close\" dispatched but verification timed out", even
though Chromium delivered the trusted click to the right coordinates
and Flutter handled it.

Mirror the pre-flight Flutter concession in the verifier:
  - jshelper.js: setupHitTargetInterceptor records inFlutter on the
    state object. pollHitTargetResult is also unified to always return
    a single object shape (was bare 'pending' / 'done' strings mixed
    with { hitTargetDescription } objects — four shapes the Go side had
    to branch on). Now exactly three: { status: 'done' },
    { status: 'pending', inFlutter }, { status: 'failed',
    hitTargetDescription }.
  - commands.go: the post-click Flutter concession lives in the shared
    dispatchCrossRoot helper (introduced in 9f17508). When poll budget
    exhausts and inFlutter is true, accept the dispatch — pre-flight
    already validated the static hit point and Flutter's own hit
    testing handled the trusted click. Living in dispatchCrossRoot
    means doubleTapOn / longPressOn / scrollUntilVisible inherit the
    concession for free, since they share this path.
    Non-Flutter timeouts continue to fail-fast as before, so the
    "click never landed" detection on real DOM stays intact.

Verified end-to-end on createboard.lgbusinesscloud.com Board Editor:
flow now passes through Welcome dialog dismiss → Pen panel → Red color
→ Close panel → More Tools → Help → Gesture page → Select Object.
Pre-fix: failed at the first Flutter semantics tap with verifier
timeout.

The five existing TestTapOnCrossRoot_* tests (TopFrame / Iframe /
IframeShadow / Occluded / Transformed) continue to pass — the Flutter
concession does not relax non-Flutter occlusion rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@richjun richjun force-pushed the fix/web-tapon-flutter-semantics branch from d0e35ec to 0a3f4c3 Compare May 11, 2026 22:31
@richjun (Contributor, Author) commented May 11, 2026

@omnarayan Rebased on aee41cc. Moved the post-click Flutter concession into dispatchCrossRoot, so doubleTapOn / longPressOn / scrollUntilVisible inherit it for free. Also unified pollHitTargetResult to a single object shape ({ status: 'done' | 'pending' | 'failed', ... }).

All 5 TestTapOnCrossRoot_* pass (incl. Occluded / Transformed).
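
For readers following along, here is a rough sketch of how a single object shape like the one above could be consumed in Go. It uses encoding/json as a stand-in for whatever value type the driver actually receives from the page; `hitTargetResult` and `parseHitTargetResult` are hypothetical names, and only the field names and status values come from this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hitTargetResult mirrors the unified object shape. The struct itself is a
// hypothetical illustration, not a type from the repository.
type hitTargetResult struct {
	Status               string `json:"status"`
	InFlutter            bool   `json:"inFlutter,omitempty"`
	HitTargetDescription string `json:"hitTargetDescription,omitempty"`
}

// parseHitTargetResult decodes one poll result and rejects anything that is
// not an object with a known status, so the caller needs a single switch.
func parseHitTargetResult(raw []byte) (hitTargetResult, error) {
	var r hitTargetResult
	if err := json.Unmarshal(raw, &r); err != nil {
		return r, err
	}
	switch r.Status {
	case "done", "pending", "failed":
		return r, nil
	}
	return r, fmt.Errorf("unexpected status %q", r.Status)
}

func main() {
	for _, raw := range []string{
		`{"status":"done"}`,
		`{"status":"pending","inFlutter":true}`,
		`{"status":"failed","hitTargetDescription":"<div class=overlay>"}`,
		`"pending"`, // the old bare-string shape no longer decodes
	} {
		res, err := parseHitTargetResult([]byte(raw))
		fmt.Printf("%s -> %+v err=%v\n", raw, res, err)
	}
}
```

With one shape, the mixed string/object branching mentioned earlier in the thread collapses into a single decode followed by a switch on the status field.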

@omnarayan (Contributor) commented

Thanks @richjun, clean rebase. Both the move into dispatchCrossRoot and the pollHitTargetResult shape unification are exactly what I was hoping for. Bonus that doubleTapOn / longPressOn / scrollUntilVisible now inherit the Flutter concession for free.

@omnarayan omnarayan merged commit ff4a9a0 into devicelab-dev:main May 12, 2026
2 of 4 checks passed