feat(human): meet agent stage 1 — headless attendee via shared-cookie webview#1163
… webview (tinyhumansai#1143)
Adds the foundational plumbing for an OpenHuman agent that joins Google Meet as an independent participant. Stage 1 only: auto-join with mic/cam off, lifecycle events (joined/left/failed), and a graceful leave command. No avatar, audio, STT, or LLM loop yet (stages 2-5).
New files:
- app/src-tauri/recipes/google-meet/agent.js: role-gated IIFE, polling loop, pure helpers exposed via window.__openhumanMeetAgent.pure
- app/test/meet-agent.test.ts: 30 Vitest tests for pure helpers
- app/src/services/meetAgent.ts: typed Tauri invoke wrappers + webview:event subscription filtered on agent event kinds
- app/src/services/meetAgent.test.ts: 11 Vitest service tests
- app/src/features/human/HumanPage.meetAgent.test.tsx: 8 Vitest UI tests
- docs/MEET_AGENT.md: architecture, selector contract, roadmap
Modified:
- app/src-tauri/src/webview_accounts/mod.rs: MEET_AGENT_JS constant, agents: Mutex<HashMap> on WebviewAccountsState, agent_label_for(), build_agent_init_script(), webview_meet_agent_join + leave commands, 4 new Rust unit tests
- app/src-tauri/src/lib.rs: register the two new commands
- app/src/features/human/HumanPage.tsx: staging-only MeetAgentPanel
Walkthrough
This PR implements Stage 1 of a Google Meet "Meet Agent": a hidden, role-gated off-screen webview that auto-navigates and attempts to join a Meet URL with mic/camera forced off, emits lifecycle events (joined/left/failed), and exposes a small global agent API. It also adds backend Tauri commands to spawn and tear down per-account agent webviews, a TypeScript service and React staging panel, tests, and documentation.
Changes
Meet Agent: Join, Event Emission, and Webview Lifecycle
Sequence Diagram
sequenceDiagram
actor User
participant HumanPage as React UI (HumanPage)
participant Service as meetAgent.ts Service
participant Tauri as Tauri Backend (webview_accounts)
participant AgentWV as Hidden Webview (agent.js)
participant Meet as Google Meet
User->>HumanPage: Click "Join" (accountId, meetingUrl)
HumanPage->>Service: meetAgentJoin({accountId, meetingUrl})
Service->>Tauri: invoke("webview_meet_agent_join", args)
Tauri->>Tauri: validate host, ensure data dir, close prior agent
Tauri->>AgentWV: spawn off-screen webview & inject MEET_AGENT_JS
AgentWV->>AgentWV: role-gate (role === "agent")
AgentWV->>Meet: navigate to meetingUrl (if needed)
AgentWV->>AgentWV: poll DOM, mute mic/cam, click Join when available
AgentWV->>Tauri: emit "meet_agent_joined"
Tauri->>Service: relay via webview:event
Service->>HumanPage: invoke subscription handler
HumanPage->>HumanPage: update status display
Estimated Code Review Effort: 4 (Complex), ~45 minutes
Pre-merge checks: 4 passed, 1 failed (inconclusive)
Actionable comments posted: 6
🧹 Nitpick comments (1)
app/src/services/meetAgent.ts (1)
143-155: ⚡ Quick win. Replace the listen(...).then(...) chain with an async wrapper.
This path is otherwise fine, but it is the changed TS segment still using a .then() chain. An async IIFE keeps the early-unsubscribe behavior the same and matches the repo's promise-handling rule.
♻️ Proposed refactor
- void listen<WebviewEventEnvelope>('webview:event', evt => {
+ void (async () => {
+   try {
+     const unlisten = await listen<WebviewEventEnvelope>('webview:event', evt => {
        const { kind, payload, account_id } = evt.payload;
        if (!AGENT_EVENT_KINDS.has(kind)) return;
        log('[meet-agent] received event kind=%s accountId=%s', kind, account_id);
        try {
          if (kind === 'meet_agent_joined') {
            handler({
              kind: 'meet_agent_joined',
              accountId: account_id,
              code: String(payload.code ?? ''),
              joinedAt: Number(payload.joinedAt ?? Date.now()),
            });
          } else if (kind === 'meet_agent_left') {
            handler({
              kind: 'meet_agent_left',
              accountId: account_id,
              reason: String(payload.reason ?? 'unknown'),
            });
          } else if (kind === 'meet_agent_failed') {
            handler({
              kind: 'meet_agent_failed',
              accountId: account_id,
              reason: String(payload.reason ?? 'unknown'),
            });
          }
        } catch (err) {
          errLog('[meet-agent] handler threw: %o', err);
        }
- }).then(
-   unlisten => {
-     if (!active) {
-       // Unsubscribe was called before the promise resolved.
-       unlisten();
-     } else {
-       cancelFn = unlisten;
-     }
-   },
-   err => {
+     });
+
+     if (!active) {
+       // Unsubscribe was called before the promise resolved.
+       unlisten();
+     } else {
+       cancelFn = unlisten;
+     }
+   } catch (err) {
      errLog('[meet-agent] listen() failed: %o', err);
    }
- );
+ })();
As per coding guidelines:
**/*.{ts,tsx,js,jsx}: Use async/await for promises instead of .then() chains.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/services/meetAgent.ts` around lines 143 - 155, Replace the listen(...).then(...) chain with an async IIFE that awaits listen() and preserves the early-unsubscribe behavior: call an immediately-invoked async function, inside try { const unlisten = await listen(...); if (!active) { unlisten(); } else { cancelFn = unlisten; } } catch (err) { errLog('[meet-agent] listen() failed: %o', err); } so you keep the same variables (cancelFn, active, errLog and listen) and error handling but use async/await instead of .then().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src-tauri/recipes/google-meet/agent.js`:
- Around line 104-110: The fallback selector in findMicButton currently returns
any [data-is-muted] element which may be the camera or another control; update
findMicButton to validate that the candidate is actually the microphone control
(e.g., check its aria-label/title/textContent contains "microphone" or "mic"
case-insensitively) before returning it, and if it fails validation continue
searching other buttons; ensure isMicOn() will only inspect/click elements
returned by findMicButton so the agent cannot mistakenly toggle the camera or
another control.
- Around line 234-253: The agent currently forces navigation to meetingUrl when
currentCode is null before checking for auth/error pages, so change the control
flow in the poll routine (symbols: extractMeetingCode, currentCode, targetCode,
window.location.replace) to detect unjoinable screens first: call
isUnjoinableScreen(doc) and handle failedEmitted/emitOnce('meet_agent_failed',
...) and stopPolling() before performing window.location.replace(meetingUrl) (or
add a guard that if currentCode is null, run isUnjoinableScreen and only
navigate if it is not an auth/error screen), ensuring the sign-in-required case
is classified instead of immediately navigating away.
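The first agent.js finding above (validating the `[data-is-muted]` fallback) can be illustrated with a minimal sketch. It assumes a simplified `ControlLike` shape instead of real DOM nodes, and the names `looksLikeMicControl` and `pickMicButton` are hypothetical, not taken from agent.js.

```typescript
// Hypothetical minimal shape of a control element; the real agent.js works on DOM nodes.
interface ControlLike {
  ariaLabel?: string | null;
  title?: string | null;
  textContent?: string | null;
}

// True when any human-readable label on the element mentions the microphone.
// Word boundaries keep "mic" from matching inside words such as "dynamic".
function looksLikeMicControl(el: ControlLike): boolean {
  const labels = [el.ariaLabel, el.title, el.textContent];
  return labels.some(label => /\b(microphone|mic)\b/i.test(label ?? ""));
}

// Return the first candidate that validates as a microphone control, or null.
// Candidates that fail validation are skipped rather than returned.
function pickMicButton(candidates: ControlLike[]): ControlLike | null {
  for (const el of candidates) {
    if (looksLikeMicControl(el)) return el;
  }
  return null;
}
```

With this shape, a camera toggle that also carries `data-is-muted` is skipped, so the caller only ever inspects or clicks an element that identified itself as the microphone.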
In `@app/src-tauri/src/webview_accounts/mod.rs`:
- Around line 559-561: The shutdown drain currently only clears
WebviewAccountsState.inner webviews; update drain_for_shutdown to also drain
WebviewAccountsState.agents by taking the agents Mutex<HashMap> (agents.lock()
or agents.get_mut()), iterating its values and invoking the same close logic
used for inner webviews to close each agent webview, then clear the map so
entries are removed; also update the test
drain_for_shutdown_clears_state_and_repeat_is_noop to assert agents is empty
after a shutdown and that repeat calls remain no-ops.
- Around line 1517-1524: The emit currently sends nav_label_clone as
"account_id" and uses the raw url; update the closure so it captures the real
account id (nav_account_id) instead of nav_label_clone and pass
redact_navigation_url(url) for the "url" field; specifically, modify the block
that calls nav_app_clone.emit to use "account_id": nav_account_id and "url":
redact_navigation_url(url) to match the user webview emit behavior.
In `@app/src/services/meetAgent.test.ts`:
- Around line 47-51: The beforeEach currently resets implementations but not
call history, causing cross-test call count leaks; update the beforeEach used in
this test file to clear/reset Vitest mocks (e.g., call mockInvoke.mockClear() or
mockInvoke.mockReset() and also clear mockIsTauri's history) so each test starts
with zero call count; locate the beforeEach block and add calls to reset the
mocks for mockInvoke and mockIsTauri (while keeping listeners.clear() and
existing mockResolvedValue/mockReturnValue setup) so meetAgentJoin and
meetAgentLeave assertions on toHaveBeenCalledOnce() are isolated.
In `@docs/MEET_AGENT.md`:
- Around line 27-31: The fenced code block in docs/MEET_AGENT.md that contains
window.__OPENHUMAN_RECIPE_CTX__ and the RUNTIME_JS/MEET_AGENT_JS comments is
missing a language tag; update the opening triple backticks to include a
language (e.g., replace ``` with ```js) so markdownlint MD040 is satisfied and
the block is properly annotated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 26fb513f-b6b4-4973-bd2b-a5e4e3f368f8
📒 Files selected for processing (9)
app/src-tauri/recipes/google-meet/agent.js
app/src-tauri/src/lib.rs
app/src-tauri/src/webview_accounts/mod.rs
app/src/features/human/HumanPage.meetAgent.test.tsx
app/src/features/human/HumanPage.tsx
app/src/services/meetAgent.test.ts
app/src/services/meetAgent.ts
app/test/meet-agent.test.ts
docs/MEET_AGENT.md
/// account_id -> agent webview label. Tracks live agent webviews so we
/// can reuse or close them without scanning all open webviews.
agents: Mutex<HashMap<String, String>>,
Agent webviews not cleaned up during shutdown.
The agents map is added to WebviewAccountsState, but drain_for_shutdown (lines 576-625) does not drain this map or close the associated agent webviews. On app exit, agent webviews will leak while user webviews in inner are properly closed.
🐛 Proposed fix in `drain_for_shutdown`
Add agent draining alongside the existing inner draining:
+ // Drain agent webviews alongside user webviews
+ let agent_labels: Vec<(String, String)> = self
+ .agents
+ .lock()
+ .ok()
+ .map(|mut g| g.drain().collect())
+ .unwrap_or_default();
+
self.inner
.lock()
.ok()
- .map(|mut g| g.drain().collect())
+ .map(|mut g| {
+ let mut labels: Vec<_> = g.drain().collect();
+ labels.extend(agent_labels);
+ labels
+ })
         .unwrap_or_default()
And update the test drain_for_shutdown_clears_state_and_repeat_is_noop to verify agents are also drained.
let _ = nav_app_clone.emit(
    "webview-account:navigate",
    serde_json::json!({
        "account_id": nav_label_clone,
        "provider": "google-meet-agent",
        "url": url.as_str(),
    }),
);
Emitting webview label instead of account_id in navigation event.
Line 1520 emits nav_label_clone (the webview label, e.g., "acct_user-123_agent") as account_id, but the frontend expects the actual account_id (e.g., "user-123"). Compare to the user webview emit at lines 1756-1762 which correctly uses nav_account_id.
Additionally, line 1522 does not redact the URL unlike the user webview at line 1761 which calls redact_navigation_url(url).
🐛 Proposed fix
Capture the actual account_id in the closure and redact the URL:
+ let nav_account_id = account_id.to_string();
let nav_label_clone = label.clone();
let nav_app_clone = app.clone();
let builder = WebviewBuilder::new(label.clone(), WebviewUrl::External(parsed_url))
.data_directory(data_dir)
.initialization_script(&init_script)
.on_navigation(move |url| {
let allowed = matches!(
url.host_str(),
Some("meet.google.com") | Some("accounts.google.com") | Some("www.google.com")
);
if !allowed {
log::debug!(
"[meet-agent] on_navigation blocked url={} label={}",
url,
nav_label_clone
);
}
// Notify the frontend of navigation events for diagnostics.
let _ = nav_app_clone.emit(
"webview-account:navigate",
serde_json::json!({
- "account_id": nav_label_clone,
+ "account_id": nav_account_id,
"provider": "google-meet-agent",
- "url": url.as_str(),
+ "url": redact_navigation_url(url),
}),
);
allowed
            });
beforeEach(() => {
  listeners.clear();
  mockInvoke.mockResolvedValue(undefined);
  mockIsTauri.mockReturnValue(true);
});
Reset shared Vitest mocks in beforeEach.
The beforeEach block resets mock implementations but does not clear call history. The first test calls meetAgentJoin() (invoking mockInvoke), and the second test then calls meetAgentLeave() (invoking mockInvoke again). Both tests assert expect(mockInvoke).toHaveBeenCalledOnce(), so without clearing the mock call history, the second test fails because the call count is 2, not 1.
🧪 Proposed fix
beforeEach(() => {
+ vi.clearAllMocks();
listeners.clear();
mockInvoke.mockResolvedValue(undefined);
mockIsTauri.mockReturnValue(true);
});
```
window.__OPENHUMAN_RECIPE_CTX__ = { accountId, provider: "google-meet", role: "agent", meetingUrl };
RUNTIME_JS      // provides window.__openhumanRecipe (emit / log / loop)
MEET_AGENT_JS   // the auto-join polling loop
```
Add a language to this fenced block.
Line 27 opens a bare code fence, so markdownlint will keep flagging MD040 here until the block is annotated, e.g. with js.
📝 Proposed fix
-```
+```js
window.__OPENHUMAN_RECIPE_CTX__ = { accountId, provider: "google-meet", role: "agent", meetingUrl };
RUNTIME_JS // provides window.__openhumanRecipe (emit / log / loop)
MEET_AGENT_JS // the auto-join polling loop</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 27-27: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
Hardens the Google Meet auto-join agent against DOM renames by porting
Vexa-ai/vexa's selector arrays (joinButton, microphoneToggle, cameraToggle,
primaryLeave, secondaryLeave, admissionIndicators, initialAdmissionIndicators,
waitingRoomIndicators, rejectionIndicators, participantContainers,
meetingContainer, nameInput) wholesale.
Adds a small query helper toolkit — queryByCssOrText / queryAllByCssOrText /
firstFromList — that translates Playwright selector syntax (:has-text,
text=, text*=, XPath) to plain document.querySelector / document.evaluate
calls, since the script runs as injected JS with no Playwright runtime.
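To make the translation step concrete, here is a minimal sketch of how a selector string could be classified before it is dispatched to document.querySelector or document.evaluate. The `parseSelector` name, the `ParsedSelector` type, and the reading of `text=` as an exact match versus `text*=` as a substring match are assumptions for illustration; the actual queryByCssOrText helpers may differ.

```typescript
// Hypothetical result type: one variant per selector form the helpers must handle.
type ParsedSelector =
  | { kind: "css"; css: string }
  | { kind: "has-text"; css: string; text: string }
  | { kind: "text-exact"; text: string }
  | { kind: "text-contains"; text: string }
  | { kind: "xpath"; expression: string };

// Remove one pair of matching surrounding quotes, if present.
function stripQuotes(s: string): string {
  const t = s.trim();
  const quoted =
    t.length >= 2 && (t[0] === '"' || t[0] === "'") && t[t.length - 1] === t[0];
  return quoted ? t.slice(1, -1) : t;
}

// Classify a Playwright-style selector without needing a Playwright runtime.
function parseSelector(selector: string): ParsedSelector {
  const s = selector.trim();
  if (s.startsWith("//") || s.startsWith("xpath=")) {
    return { kind: "xpath", expression: s.replace(/^xpath=/, "") };
  }
  // Check "text*=" before "text=" so the longer prefix wins.
  if (s.startsWith("text*=")) {
    return { kind: "text-contains", text: stripQuotes(s.slice("text*=".length)) };
  }
  if (s.startsWith("text=")) {
    return { kind: "text-exact", text: stripQuotes(s.slice("text=".length)) };
  }
  // e.g. button:has-text("Join now") -> css "button", text "Join now"
  const hasText = /^(.*):has-text\((["'])(.*)\2\)$/.exec(s);
  if (hasText) {
    return { kind: "has-text", css: hasText[1] || "*", text: hasText[3] ?? "" };
  }
  return { kind: "css", css: s };
}
```

A caller would then route `css` to document.querySelector, `xpath` to document.evaluate, and the text variants to a scan over candidate elements' text content.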
Behavioural changes:
- isInCall now walks both initialAdmissionIndicators (strict, lobby-safe)
and admissionIndicators (broader) instead of two hard-coded selectors.
- isUnjoinableScreen now walks SELECTORS.rejectionIndicators and returns
one of five stable reason strings mapped per-selector.
- Adds isInWaitingRoom: polling loop treats this as non-terminal — keeps
retrying without emitting meet_agent_failed.
- findLeaveButton walks primaryLeave then secondaryLeave (confirmation dialog).
- isMicOn / isCamOn updated with Vexa's aria-label heuristic ("Turn off X"
means currently on; "Turn on X" means currently off).
- Join timeout extended from 60 s → 120 s to allow host admission latency.
- State-machine transitions logged as [meet-agent] state: X → Y.
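The behavioural changes above can be summarised as a per-tick decision function. This is a sketch under stated assumptions, not the agent.js implementation: the `Observation` and `Decision` types, the state names, the `"timeout"` reason string, and the order of the checks are all hypothetical. Only the 120 s timeout and the non-terminal waiting room come from the commit message.

```typescript
// What one poll tick observed; in agent.js these would come from
// isInCall, isInWaitingRoom and isUnjoinableScreen.
type Observation = {
  inCall: boolean;
  inWaitingRoom: boolean;
  rejectionReason: string | null; // a stable reason string, or null if not rejected
};

type Decision =
  | { action: "emit_joined" }
  | { action: "emit_failed"; reason: string }
  | { action: "keep_polling"; state: "joining" | "waiting_room" };

// 120 s, per the commit message; allows for host admission latency.
const JOIN_TIMEOUT_MS = 120 * 1000;

function decide(obs: Observation, elapsedMs: number): Decision {
  if (obs.inCall) return { action: "emit_joined" };
  if (obs.rejectionReason !== null) {
    return { action: "emit_failed", reason: obs.rejectionReason };
  }
  // Assumed reason string for the timeout case.
  if (elapsedMs >= JOIN_TIMEOUT_MS) return { action: "emit_failed", reason: "timeout" };
  // The waiting room is non-terminal: keep retrying without emitting a failure.
  return { action: "keep_polling", state: obs.inWaitingRoom ? "waiting_room" : "joining" };
}
```

Keeping the decision pure makes it testable without a DOM, in the same spirit as the helpers exposed for Vitest.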
Tests: meet-agent.test.ts grows from 22 → 70 tests, covering all four
selector forms (plain CSS, :has-text, XPath, text=/text*=), firstFromList
fallthrough, isInWaitingRoom fixtures, all five rejection reason strings, and
both primary + secondary leave paths.
docs/MEET_AGENT.md: adds Known limitations ("Switch here" not handled) and a
References section attributing Vexa. Relates to PR tinyhumansai#1163.
Selector hardening via Vexa port
This commit ports Vexa-ai/vexa's Google Meet selector library (
What changed
What's deferred
"Switch here" (shown when the user is already in the same meeting from another device or tab) is deliberately not handled; it requires separate logic and is tracked for a follow-up.
Summary
- meet_agent_joined, meet_agent_left, meet_agent_failed).
- webview_meet_agent_join (spawn + auto-join) and webview_meet_agent_leave (graceful leave + close).
- meetAgent.ts service with typed invokes and a subscribeMeetAgentEvents listener filtered on agent event kinds over the existing webview:event channel.
- MeetAgentPanel dev UI in HumanPage (APP_ENVIRONMENT !== 'production') with account-id/meeting-url inputs, Join/Leave buttons, and a live status line.
- docs/MEET_AGENT.md covering architecture, selector contract, and the stage 2–5 roadmap.
Problem
Issue #1143: OpenHuman needs to join Google Meet as an independent participant it controls end-to-end. Stage 1 proves the agent can authenticate, navigate, and become an attendee without user interaction, as a precondition for adding avatar, TTS, STT, and the LLM loop in later stages.
Solution
- data_directory_for(app, account_id): the same path as the user's existing google-meet webview, so Google session cookies are shared and no re-login is needed. The webview uses a different label (acct_{id}_agent) to stay independent.
- agent.js is a role-gated IIFE (__OPENHUMAN_RECIPE_CTX__.role === "agent" required). It polls every 1 s with a 60 s timeout: mutes mic/cam, clicks Join, watches for in-call presence signals, and emits lifecycle events via the existing window.__openhumanRecipe runtime bridge. Pure helpers are exposed on window.__openhumanMeetAgent.pure for Vitest.
- on_navigation in the agent webview allows only meet.google.com, accounts.google.com, and www.google.com; everything else is blocked.
Submission Checklist
Impact
- WebviewAccountsState gains one new Mutex<HashMap> field (agents); no migration needed.
- agent.js are "stable-ish" and will need periodic maintenance like recipe.js; flagged in docs/MEET_AGENT.md.
- No real-meeting smoke test was run: there is no way to verify a live Meet session in this environment. The join flow depends on DOM selectors that are only observable in a real browser, and was designed conservatively (60 s timeout, explicit mic/cam muting, unjoinable-screen detection).
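For reference, the navigation allowlist described under Solution amounts to an exact host match. The sketch below mirrors that check in TypeScript. The real check lives in the Rust on_navigation closure, and the name `isAllowedAgentNavigation` is hypothetical.

```typescript
// The three hosts the agent webview is allowed to navigate to, per the PR description.
const ALLOWED_AGENT_HOSTS = new Set([
  "meet.google.com",
  "accounts.google.com",
  "www.google.com",
]);

// Exact host match: look-alike hosts such as "meet.google.com.evil.example" are rejected.
function isAllowedAgentNavigation(rawUrl: string): boolean {
  try {
    return ALLOWED_AGENT_HOSTS.has(new URL(rawUrl).hostname);
  } catch {
    return false; // unparseable URLs are treated as blocked
  }
}
```

Matching the full hostname instead of a suffix or substring is the conservative choice here, since the agent webview shares the user's Google session cookies.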
Related