Merged
7 changes: 4 additions & 3 deletions README.md
@@ -100,10 +100,10 @@ Every session is backed by a real PTY (`node-pty`) and an append-only event log.

Rendering uses Ghostty's terminal engine through two interchangeable backends (`--renderer`):

- **`libghostty-vt`**: Ghostty's native VT engine, bound into Node. Fast, browser-free semantic snapshots and `wait` checks. Also powers the dashboard.
- **`ghostty-web`** (default): a headless web build of Ghostty driven by Playwright/Chromium. Adds pixel PNG screenshots and WebM video.
- **`libghostty-vt`**: Ghostty's native VT engine, bound into Node. It is the default for semantic snapshots, render-backed `wait` checks, and screen hashes when the optional native package is available. It also powers the dashboard.
- **`ghostty-web`**: a headless web build of Ghostty driven by Playwright/Chromium. It is the default for visual PNG screenshots and WebM video, and it remains the automatic semantic fallback when `libghostty-vt` is unavailable.

`ghostty-web` is a _reference_ renderer: it shows what a pinned Ghostty build draws, not a pixel-for-pixel guarantee of any particular native terminal window. That tradeoff is deliberate. The renderer sits behind an adapter, so native backends can be added later without changing the CLI contract.
`ghostty-web` is a _reference_ visual renderer: it shows what a pinned Ghostty build draws, not a pixel-for-pixel guarantee of any particular native terminal window. That tradeoff is deliberate. The renderer sits behind an adapter, so additional backends can be used without changing the CLI contract. Set `--renderer ghostty-web`, `AGENT_TTY_RENDERER=ghostty-web`, or `config.json` `defaultRenderer` to restore legacy all-`ghostty-web` behavior.
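
The split defaults and overrides described in this change can be read as a small resolution function. The TypeScript below is only a sketch under stated assumptions: the type names, the `resolveRenderer` helper, the flag-over-environment-over-config precedence, and the decision to throw when `libghostty-vt` is requested explicitly but missing are all hypothetical, not `agent-tty`'s actual implementation. The rule that WebM is always produced by `ghostty-web` follows the note in `docs/USAGE.md`.

```typescript
// Hypothetical names throughout; this is not agent-tty's real code.
type Renderer = "libghostty-vt" | "ghostty-web";
type Operation = "snapshot" | "wait" | "screenshot" | "webm";

interface RendererInputs {
  flag?: Renderer; // value of --renderer, if given
  env?: Renderer; // value of AGENT_TTY_RENDERER, if set
  config?: Renderer; // config.json defaultRenderer, if set
  nativeAvailable: boolean; // whether the optional native package loaded
}

function resolveRenderer(op: Operation, inputs: RendererInputs): Renderer {
  // Assumed precedence: CLI flag, then environment variable, then config file.
  const explicit = inputs.flag ?? inputs.env ?? inputs.config;

  // WebM video is produced by ghostty-web even under a native override.
  if (op === "webm") return "ghostty-web";

  if (explicit !== undefined) {
    // Assumption: an explicit native request does not silently fall back.
    if (explicit === "libghostty-vt" && !inputs.nativeAvailable) {
      throw new Error("libghostty-vt was requested explicitly but is unavailable");
    }
    return explicit;
  }

  // Split defaults: semantic work prefers native, visual work uses the browser.
  const semantic = op === "snapshot" || op === "wait";
  return semantic && inputs.nativeAvailable ? "libghostty-vt" : "ghostty-web";
}
```

With no override, `resolveRenderer("snapshot", { nativeAvailable: false })` falls back to `ghostty-web`, which mirrors the automatic semantic fallback described above.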

## Where it came from

@@ -141,6 +141,7 @@ See [`docs/AGENT-SKILLS.md`](./docs/AGENT-SKILLS.md).
`agent-tty` is `0.4.3` and focused on reliable, isolated, reviewable terminal and TUI automation through a stable CLI. <!-- x-release-please-version -->

- Linux and macOS are tier-1; Windows is tier-2 and not CI-tested.
- Semantic snapshots and render-backed waits prefer the optional `libghostty-vt` backend when it is available, then fall back to `ghostty-web`.
- Screenshots and WebM video depend on Playwright/Chromium and the `ghostty-web` backend.
- `run` is best for shell setup and command injection; it does not capture a child command's structured output or exit status.
- Apache-2.0, runs entirely locally, no account or SaaS.

6 changes: 3 additions & 3 deletions RELEASE.md
@@ -9,15 +9,15 @@ For per-release changes, see [`CHANGELOG.md`](./CHANGELOG.md). For release mecha
## Supported capabilities

- Reliable isolated session lifecycle management: `create`, `inspect`, `destroy`, and `gc` all work against isolated agent-tty homes.
- Renderer-backed screenshots, semantic snapshots, and WebM export for reviewer-visible proof artifacts.
- Renderer-backed screenshots, semantic snapshots, and WebM export for reviewer-visible proof artifacts; semantic operations prefer `libghostty-vt` when available, while visual artifacts use `ghostty-web`.
- The `run` command for robust in-session command execution without having to simulate long shell setup scripts as manual keystrokes.
- `doctor --json` with isolation-aware diagnostics for home resolution, renderer prerequisites, and screenshot viability.
- An append-only event log that remains the canonical replay/export source of truth.
- Schema-locked JSON envelopes across the public CLI surface.

## Explicitly out of scope

- Native renderer backends such as Ghostty native or kitty.
- Additional native renderer backends beyond the shipped `libghostty-vt` semantic renderer, such as kitty or platform terminal automation.
- Mouse input support.
- Remote or networked sessions.
- An MCP wrapper.
@@ -27,7 +27,7 @@ For per-release changes, see [`CHANGELOG.md`](./CHANGELOG.md). For release mecha

## Known limitations

- The renderer is the `ghostty-web` reference backend, not a native-terminal parity guarantee.
- Semantic operations may use `libghostty-vt`, but visual screenshots and WebM remain `ghostty-web` reference artifacts, not a native-terminal parity guarantee.
- `run` completion detection relies on shell-visible echo of an injected boundary marker.
- Screenshots and WebM export require Playwright/Chromium to be installed and discoverable.
- The reviewed LazyVim workflow currently assumes Neovim `>= 0.11.2` plus its usual prerequisites; older Neovim builds are out of contract for that scenario.

33 changes: 17 additions & 16 deletions design/ARCHITECTURE.md
@@ -21,7 +21,7 @@ This design intentionally describes a **general product**, not a Mux-specific im

## Current shipped status

The current `0.3.x` line is centered on reliable, isolated, reviewable terminal and TUI automation. The shipped surface includes `run` for robust in-session command execution, renderer/browser-path handling that respects isolated-home workflows, and isolation-aware `doctor --json` diagnostics on top of lifecycle, snapshot, screenshot, and export work. Larger asks such as native renderers, mouse input, remote/network sessions, MCP wrapping, and broader semantic TUI automation remain intentionally deferred.
The current shipped line is centered on reliable, isolated, reviewable terminal and TUI automation. The shipped surface includes `run` for robust in-session command execution, split semantic/visual renderer defaults, renderer/browser-path handling that respects isolated-home workflows, and isolation-aware `doctor --json` diagnostics on top of lifecycle, snapshot, screenshot, and export work. Larger asks such as additional native renderers, mouse input, remote/network sessions, MCP wrapping, and broader semantic TUI automation remain intentionally deferred.

The repository now ships the first three milestones of this design plus Weeks 4–7 of CLI/artifact/lifecycle hardening, config/rendering/platform closeout, contract/introspection reconciliation, and Week 7 contract/doc ratification:

@@ -53,10 +53,10 @@ The recommended v1 shape is:
3. **TypeScript/Node** implementation
4. **One session-host process per terminal session**, not a global daemon
5. **`node-pty`** for PTY/process control
6. **`ghostty-web`** as the default reference renderer
7. **Playwright** as the screenshot / replay harness
6. **`libghostty-vt`** as the preferred semantic renderer when available, with **`ghostty-web`** as the visual reference renderer and semantic fallback
7. **Playwright** as the screenshot / replay-video harness
8. **Event-log-as-truth** architecture so screenshots, snapshots, and recordings can be replayed deterministically
9. **Renderer adapter interface** from day one so native renderers can be added later without redesigning the CLI
9. **Renderer adapter interface** from day one so renderer defaults can evolve without redesigning the CLI
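
To make the event-log-as-truth and adapter items concrete, here is a minimal sketch of replaying a log through a swappable backend. Everything in it is hypothetical: the `LogEvent` shape, the `RendererAdapter` interface, and the toy `PlainTextAdapter` (which ignores escape sequences) are illustrations, not the project's real types. The point it shows is that a replay depends only on the log and the adapter, so the same log yields the same result every time.

```typescript
// Hypothetical event and adapter shapes, for illustration only.
type LogEvent =
  | { kind: "output"; data: string }
  | { kind: "resize"; cols: number; rows: number };

interface RendererAdapter {
  resize(cols: number, rows: number): void;
  write(data: string): void;
  visibleText(): string[];
}

// Toy backend: it does not interpret escape sequences at all.
// A real backend would feed the bytes to a VT engine instead.
class PlainTextAdapter implements RendererAdapter {
  private cols = 80;
  private rows = 24;
  private buffer = "";

  resize(cols: number, rows: number): void {
    this.cols = cols;
    this.rows = rows;
  }

  write(data: string): void {
    // Drop carriage returns so "\r\n" behaves like a plain newline.
    this.buffer += data.replace(/\r/g, "");
  }

  visibleText(): string[] {
    // Keep the last `rows` lines, clipped to `cols` characters.
    return this.buffer
      .split("\n")
      .slice(-this.rows)
      .map((line) => line.slice(0, this.cols));
  }
}

// Replay never touches a live PTY: the log is the only input.
function replay(log: readonly LogEvent[], makeAdapter: () => RendererAdapter): string[] {
  const adapter = makeAdapter();
  for (const event of log) {
    if (event.kind === "resize") adapter.resize(event.cols, event.rows);
    else adapter.write(event.data);
  }
  return adapter.visibleText();
}
```

Because `replay` takes an adapter factory rather than a concrete class, swapping the backend would not change the caller, which is the property the adapter item aims for.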

## Why this shape

@@ -156,29 +156,30 @@ That lets v1:
- render videos from replay,
- and debug failures after the fact.

### 5) Reference renderer now, native renderers later
### 5) Semantic and visual renderers stay separated

V1 uses `ghostty-web` as a reference renderer for:
V1 uses two Ghostty-backed renderer paths by default:

- semantic snapshots,
- deterministic screenshots,
- deterministic video replay.
- `libghostty-vt` for semantic snapshots, screen hashes, and render-backed waits when the optional native package is available,
- `ghostty-web` for deterministic screenshots and deterministic video replay,
- `ghostty-web` again as the semantic fallback when native rendering is unavailable.

The architecture reserves native backends for later:
The architecture still reserves additional native backends for later:

- WezTerm-like native automation,
- Ghostty native automation,
- platform-specific terminal automation,
- platform-specific compatibility runs.

## Tiered truth model

`agent-tty` should treat terminal truth as layered rather than singular.

| Layer | Source of truth | What it answers |
| ---------------------- | --------------------------- | --------------------------------------------------------- |
| Execution truth | PTY + event log | What bytes, signals, and resize events actually occurred? |
| Reference visual truth | `ghostty-web` replay/render | What does a pinned reference renderer show? |
| Native visual truth | Future native adapter | What does a real platform terminal show? |
| Layer                   | Source of truth                    | What it answers                                           |
| ----------------------- | ---------------------------------- | --------------------------------------------------------- |
| Execution truth         | PTY + event log                    | What bytes, signals, and resize events actually occurred? |
| Semantic renderer truth | `libghostty-vt` or fallback replay | What terminal cells/text does Ghostty's VT state expose?  |
| Reference visual truth  | `ghostty-web` replay/render        | What does a pinned reference renderer show?               |
| Native visual truth     | Future native adapter              | What does a real platform terminal show?                  |

This prevents v1 from pretending reference rendering is identical to native platform rendering.


10 changes: 5 additions & 5 deletions docs/TROUBLESHOOTING.md
@@ -35,13 +35,14 @@ Affected commands usually include:

Check `doctor --json` for:

- `libghostty_vt_available` (preferred semantic renderer and dashboard)
- `playwright_available`
- `browser_cache_accessible`
- `browser_launch`
- `ghostty_web_available`
- `ghostty_web_available` (visual renderer and semantic fallback)
- `screenshot_viable`

If these fail in CI or a container, install Chromium during setup and make sure the cache is readable by the process running `agent-tty`.
If the browser-backed checks fail in CI or a container, install Chromium during setup and make sure the cache is readable by the process running `agent-tty`. If `libghostty_vt_available` is skipped or unavailable and no renderer is explicitly configured, semantic commands should fall back to `ghostty-web`; use `--renderer ghostty-web` to make that choice explicit. If `AGENT_TTY_RENDERER=libghostty-vt` is set, or the Home `config.json` sets `defaultRenderer` to `libghostty-vt`, clear that explicit configuration or override it with `ghostty-web` on machines without the optional native package.
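
The triage above could be scripted roughly as follows. This is a sketch only: the check names come from the list above, but the `Checks` map and the `"pass" | "fail" | "skip"` status values are assumptions about how a caller might flatten the `doctor --json` output, not its documented schema.

```typescript
// Assumed, simplified view of doctor results: check name -> status.
// The real `doctor --json` envelope may be shaped differently.
type CheckStatus = "pass" | "fail" | "skip";
type Checks = Partial<Record<string, CheckStatus>>;

// Checks that depend on Playwright/Chromium, taken from the list above.
const BROWSER_CHECKS: string[] = [
  "playwright_available",
  "browser_cache_accessible",
  "browser_launch",
  "ghostty_web_available",
  "screenshot_viable",
];

function triage(checks: Checks): string[] {
  const advice: string[] = [];

  // Any browser-backed check that did not pass points at the Chromium setup.
  const notPassing = BROWSER_CHECKS.filter((name) => checks[name] !== "pass");
  if (notPassing.length > 0) {
    advice.push(
      `Browser-backed checks not passing (${notPassing.join(", ")}): ` +
        "install Chromium and make its cache readable.",
    );
  }

  // A missing native package is not fatal unless it was requested explicitly.
  if (checks["libghostty_vt_available"] !== "pass") {
    advice.push(
      "libghostty-vt unavailable: semantic commands fall back to ghostty-web " +
        "unless a renderer is configured explicitly.",
    );
  }

  return advice;
}
```

An empty result means none of the renderer checks listed above need attention.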

## Isolated Homes

@@ -77,10 +78,9 @@ or install a GitHub Release tarball as described in [`INSTALL.md`](./INSTALL.md)

## Reference Rendering Caveat

`ghostty-web` is the reference renderer for snapshots, screenshots, and replay video.
It gives repeatable artifacts for review and automation, but it does not guarantee exact native-terminal pixel parity.
`libghostty-vt` is the preferred default for semantic snapshots and render-backed waits when the optional native package is available. `ghostty-web` remains the reference visual renderer for screenshots and replay video, and it is the automatic semantic fallback when native rendering is unavailable and no renderer override is set.

If a bug depends on a specific native terminal emulator, keep the `agent-tty` artifact as reference evidence and capture native-terminal evidence separately when needed.
These renderers give repeatable artifacts for review and automation, but they do not guarantee exact native-terminal pixel parity. If a bug depends on a specific native terminal emulator, keep the `agent-tty` artifact as reference evidence and capture native-terminal evidence separately when needed.

## Stray `%` at the End of Captured Output


6 changes: 4 additions & 2 deletions docs/USAGE.md
@@ -190,8 +190,8 @@ The Wait Baseline fixes stale-match only. It does **not** fix echo-match: a `wai

## Screenshots And Recording Exports

Screenshots and WebM export use the `ghostty-web` reference renderer through Playwright/Chromium.
Run `doctor --json` first in new environments.
Screenshots and WebM export use the `ghostty-web` reference visual renderer through Playwright/Chromium.
Semantic `snapshot`, screen-hash, and render-backed `wait` paths prefer `libghostty-vt` when the optional native package is available and fall back to `ghostty-web` otherwise. Run `doctor --json` first in new environments.

```bash
agent-tty screenshot <session-id> --profile reference-dark --json
```

@@ -202,6 +202,8 @@ agent-tty record export <session-id> --format webm --timing accelerated --out ./

WebM export replays with recorded wall-clock timing by default. Pass `--timing accelerated` (idle gaps clamped to 400ms) or `--timing max-speed` for a time-compressed video.
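
The three timing modes can be pictured as a transform over event timestamps. The sketch below is an illustration under assumptions, not the exporter's real code: the `retime` helper and the `"recorded"` label for the default mode are hypothetical, and treating `max-speed` as "no delay between events" is a guess at its meaning. Only the 400ms clamp for `accelerated` comes from the text above.

```typescript
// "recorded" is a hypothetical label for the default wall-clock mode.
type Timing = "recorded" | "accelerated" | "max-speed";

// Maps recorded event timestamps (ms) to playback offsets (ms from start).
function retime(timestampsMs: readonly number[], timing: Timing, maxGapMs = 400): number[] {
  const offsets: number[] = [];
  let clock = 0;
  let previous: number | undefined;

  for (const t of timestampsMs) {
    if (previous !== undefined) {
      const gap = Math.max(0, t - previous);
      if (timing === "recorded") {
        clock += gap; // keep the original pacing
      } else if (timing === "accelerated") {
        clock += Math.min(gap, maxGapMs); // clamp long idle gaps
      }
      // "max-speed": assumed to add no delay between events
    }
    offsets.push(clock);
    previous = t;
  }
  return offsets;
}

console.log(retime([0, 100, 5100, 5200], "accelerated")); // [ 0, 100, 500, 600 ]
```

In the example, the five-second idle gap shrinks to 400ms while the short gaps keep their original length.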

Use `--renderer ghostty-web`, `AGENT_TTY_RENDERER=ghostty-web`, or Home `config.json` `{ "defaultRenderer": "ghostty-web" }` to force legacy all-browser rendering. Use `--renderer libghostty-vt` only when you intentionally want semantic and screenshot requests routed through the native backend; WebM requests still record `ghostty-web` as the actual video producer.

`ghostty-web` provides reference visual truth for reviewable artifacts; it does not promise exact pixel parity with native terminals.

## Isolation

4 changes: 2 additions & 2 deletions docs/prd/screen-hash/PRD.md
@@ -31,7 +31,7 @@ Snapshot results and matched **Render Wait** results gain an optional **Screen H
- Add an optional **Screen Hash** field, a lowercase 64-character SHA-256 hex digest, to the snapshot result (both structured and text formats) and to the matched render-wait result.
- In scope: a **Batch Step** record for a matched **Render Wait** step also carries the **Screen Hash**, mirrored from that step's render-wait result, so a batch run exposes the same content identity per wait step that a standalone wait does.
- The **Screen Hash** is the SHA-256 of the canonical visible-text string: the visible lines joined by newline, exactly as the host's screen-stability compare and the text matcher already build it. The shared canonical-text **definition**, `visibleLines[].text` joined by `\n` and sourced only from the snapshot (never `backend.getVisibleText()` or `cells[]`), is unchanged by adding the hash. Cursor position, text styles, and scrollback are excluded.
- Converging the two renderer backends on one canonical screen form (Phase 1) intentionally changes the **default** `ghostty-web` backend's stability and text-wait **comparand** on screens with grapheme clusters, interior blank-cell gaps, or non-ASCII trailing characters: the canonical form is exactly `rows` lines, each decoded with full grapheme clusters with blank/zero cells as `' '`, then right-trimmed of trailing ASCII spaces (`0x20`) only. This is a deliberate, narrow change pinned by characterization tests, not a free behavior-preserving add; on plain ASCII screens the comparand is unchanged.
- Converging the two renderer backends on one canonical screen form (Phase 1) intentionally changed the then-default `ghostty-web` backend's stability and text-wait **comparand** on screens with grapheme clusters, interior blank-cell gaps, or non-ASCII trailing characters: the canonical form is exactly `rows` lines, each decoded with full grapheme clusters with blank/zero cells as `' '`, then right-trimmed of trailing ASCII spaces (`0x20`) only. This was a deliberate, narrow change pinned by characterization tests, not a free behavior-preserving add; on plain ASCII screens the comparand was unchanged.
- Extract one shared canonical-screen-text helper and route the **Screen Hash**, the host **Screen Stability** compare, and the text **Render Wait** matcher through it, so the three share a single definition and cannot diverge.
- The hash is keyed on whether a result holds an **observed** **Semantic Snapshot**, not on whether the wait matched. A result carries the **Screen Hash** of the snapshot it observed: a matched live wait, a snapshot capture, and the offline host-unreachable fallback that still observed a latest snapshot (even when it returns `matched: false` because the **Screen Stability** duration could not be proven offline). The hash is omitted only when no snapshot was observed: a live wait that times out, a consecutive-failure giveup, or a replay error throw.
- Do not surface the **Screen Hash** on inspection or any path that does not already render a **Semantic Snapshot**; computing it must never force a renderer bootstrap that would not otherwise happen.
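
The decisions above describe the hash precisely enough to sketch it. In the TypeScript below, `canonicalScreenText` and `screenHash` are hypothetical helper names and the `SemanticSnapshot` shape is reduced to the two fields the PRD mentions. Normalizing inside the helper (padding to `rows` lines and trimming trailing `0x20`) is an assumption; the real snapshot may already arrive in canonical form, in which case these steps are no-ops.

```typescript
import { createHash } from "node:crypto";

// Reduced, hypothetical snapshot shape: only the fields the hash needs.
interface SemanticSnapshot {
  rows: number;
  visibleLines: { text: string }[];
}

// Canonical form: exactly `rows` lines, each right-trimmed of ASCII spaces only.
function canonicalScreenText(snapshot: SemanticSnapshot): string {
  const lines: string[] = [];
  for (let i = 0; i < snapshot.rows; i++) {
    const text = snapshot.visibleLines[i]?.text ?? "";
    // Trim 0x20 only; trimEnd() would also strip tabs, NBSP, and similar.
    lines.push(text.replace(/ +$/, ""));
  }
  return lines.join("\n");
}

// Lowercase 64-character SHA-256 hex digest of the canonical text.
function screenHash(snapshot: SemanticSnapshot): string {
  return createHash("sha256").update(canonicalScreenText(snapshot), "utf8").digest("hex");
}
```

Routing the stability compare and the text matcher through the same `canonicalScreenText` helper is what would keep the three consumers from diverging; cursor position and styles never enter the digest because they are not part of the joined text.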
@@ -55,7 +55,7 @@ Good tests assert external behavior, not implementation details.
- A styled or per-cell hash. Transient style churn would make such a hash flap; the **Screen Hash** is text-content identity only.
- Pixel-level identity, and any **Screen Hash** on the **Screenshot Result**. A **Screenshot Result** carries only its pixel `sha256`; the content hash lives on the snapshot and wait results. The **Screen Hash** is the semantic counterpart to the pixel digest and the two are not interchangeable.
- New wait semantics built on the hash (for example, "wait until the screen content changes"). v1 only exposes the field; any hash-driven wait is future scope.
- Any change to the screen-stability behavior **beyond** the Phase 1 renderer-convergence change described in the Implementation Decisions. The canonical-text definition and the shared single-source unification are behavior-preserving; the only intended behavior change is the default `ghostty-web` backend's comparand on grapheme / interior-gap / non-ASCII-trailing screens, pinned by characterization tests. No new wait semantics are added.
- Any change to the screen-stability behavior **beyond** the Phase 1 renderer-convergence change described in the Implementation Decisions. The canonical-text definition and the shared single-source unification are behavior-preserving; the only intended behavior change was the then-default `ghostty-web` backend's comparand on grapheme / interior-gap / non-ASCII-trailing screens, pinned by characterization tests. No new wait semantics are added.

## Further Notes
