feat: render inline images via Kitty graphics protocol #77

Open
trieloff wants to merge 3 commits into vercel-labs:main from trieloff:feat/kitty-graphics-protocol

Conversation

@trieloff trieloff commented May 21, 2026

[Screenshot 2026-05-21 at 10 48 25]

Summary

Closes #60.

Adds inline image rendering to wterm via the Kitty terminal graphics protocol. Everything lives in @wterm/dom — no changes to the Zig core or libghostty.

  • A streaming filter in @wterm/dom intercepts APC \x1b_G...\x1b\\ sequences before they reach the VT core, parses the key=value control + base64 payload (with chunked m=1/m=0 accumulation), and emits image events.
  • An overlay layer renders each image as an absolutely-positioned <img> inside the term-grid container. Because the existing renderer inserts new scrollback rows above the grid, content's pixel offset is invariant under scroll-up — so an image stays pinned to the cells where it was placed, with no per-row anchor bookkeeping.
  • Default T/p cursor advancement is implemented by writing newlines to the core based on either the r= control or the PNG IHDR dimensions (via a tiny header reader).
  • Enabled by default. Opt out with new WTerm(el, { images: false }). The same flag is threaded through @wterm/react and @wterm/vue.
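The streaming interception described in the first bullet can be sketched as a small byte-level state machine. This is a minimal illustration, not the PR's `KittyGraphicsFilter`: the names `ApcSplitter` and `SplitEvent` are hypothetical, and the handling of a stray ESC inside the APC body is an assumption made to keep the sketch short.

```typescript
// Illustrative sketch only; names here are hypothetical, not the PR's API.
type SplitEvent =
  | { type: "text"; bytes: number[] }
  | { type: "apc"; body: number[] }; // bytes between ESC _ G and the terminator

type State = "idle" | "esc" | "apcStart" | "body" | "bodyEsc";

const ESC = 0x1b;
const BEL = 0x07;
const UNDERSCORE = 0x5f;
const UPPER_G = 0x47;
const BACKSLASH = 0x5c;

class ApcSplitter {
  private state: State = "idle"; // survives across push() calls
  private body: number[] = [];

  push(input: Uint8Array): SplitEvent[] {
    const events: SplitEvent[] = [];
    let text: number[] = [];
    const flushText = (): void => {
      if (text.length > 0) {
        events.push({ type: "text", bytes: text });
        text = [];
      }
    };
    const finishApc = (): void => {
      events.push({ type: "apc", body: this.body });
      this.body = [];
      this.state = "idle";
    };

    for (let i = 0; i < input.length; i++) {
      const b = input[i];
      switch (this.state) {
        case "idle":
          if (b === ESC) this.state = "esc";
          else text.push(b);
          break;
        case "esc":
          if (b === UNDERSCORE) this.state = "apcStart";
          else if (b === ESC) text.push(ESC); // pass the first ESC, keep waiting on the second
          else { text.push(ESC, b); this.state = "idle"; } // e.g. CSI: pass through untouched
          break;
        case "apcStart":
          if (b === UPPER_G) { flushText(); this.body = []; this.state = "body"; }
          else if (b === ESC) { text.push(ESC, UNDERSCORE); this.state = "esc"; }
          else { text.push(ESC, UNDERSCORE, b); this.state = "idle"; } // non-Kitty APC
          break;
        case "body":
          if (b === BEL) finishApc();
          else if (b === ESC) this.state = "bodyEsc";
          else this.body.push(b);
          break;
        case "bodyEsc":
          if (b === BACKSLASH) finishApc(); // ESC \ is the string terminator
          else if (b === ESC) this.body.push(ESC);
          else { this.body.push(ESC, b); this.state = "body"; } // assumption: keep stray ESC as data
          break;
      }
    }
    flushText();
    return events;
  }
}
```

Because the state lives on the instance, a sequence split across two `push()` calls is reassembled: the partial body is held back and only ordinary text is emitted in the meantime.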

Scope (intentional MVP)

In:

  • Actions: t (transmit), T (transmit + display), p (put placement), d (delete)
  • Format: PNG (f=100)
  • Transport: direct base64 (t=d, default), including chunked m=1/m=0
  • i= / I= identifiers, p= placement IDs, c=/r= cell sizing, z= z-index, X=/Y= cell offsets, C=1 cursor-stay opt-out
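For readers unfamiliar with the wire format, the control keys listed above travel as comma-separated `key=value` pairs, followed by `;` and the base64 payload. A rough sketch of splitting such a body follows; `parseApcBody` and `ParsedApc` are hypothetical names, and the PR's own `parseControl` may differ.

```typescript
// Hypothetical helper: parses the APC body that follows "ESC _ G",
// e.g. "a=T,f=100,i=7,m=1;<base64>". Not the PR's parseControl.
interface ParsedApc {
  control: Record<string, string | number>;
  payload: string; // still base64-encoded
}

function parseApcBody(body: string): ParsedApc {
  const sep = body.indexOf(";");
  const ctrl = sep === -1 ? body : body.slice(0, sep);
  const payload = sep === -1 ? "" : body.slice(sep + 1);
  const control: Record<string, string | number> = {};
  for (const pair of ctrl.split(",")) {
    const eq = pair.indexOf("=");
    if (eq <= 0) continue; // skip empty or malformed pairs
    const key = pair.slice(0, eq);
    const raw = pair.slice(eq + 1);
    // Numeric values (i=, r=, c=, z=, m=, ...) become numbers; single
    // letters such as a=T or t=d stay strings.
    control[key] = /^-?\d+$/.test(raw) ? Number(raw) : raw;
  }
  return { control, payload };
}
```

With this shape, `parseApcBody("a=T,f=100,i=7,m=1;QUJD")` yields `control.a === "T"`, `control.f === 100`, and `payload === "QUJD"`.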

Out (follow-ups):

  • Raw RGB/RGBA frames (f=24 / f=32)
  • File / shared-memory / temp-file transports (t=f/t=s/t=t)
  • Virtual placement via Unicode placeholders
  • Animations
  • Sixel
  • iTerm2 inline images

Each scope cut is called out in the docs so users know what to expect.

What changed

  • packages/@wterm/dom/src/kitty-graphics.ts (new) — streaming APC parser/filter
  • packages/@wterm/dom/src/image-overlay.ts (new) — DOM overlay with store / place / delete
  • packages/@wterm/dom/src/png.ts (new) — PNG IHDR dimension reader
  • packages/@wterm/dom/src/wterm.ts — wired filter into write(), added images option
  • packages/@wterm/dom/src/renderer.ts: setup() now preserves non-row children so the overlay survives ResizeObserver-driven resizes
  • packages/@wterm/dom/src/terminal.css: .term-grid { position: relative } + .term-image-layer / .term-image rules
  • packages/@wterm/{react,vue}/src/Terminal.* — pass-through images prop
  • examples/kitty-images/ (new) — self-contained Vite demo that draws a PNG on a <canvas> and transmits it via a chunked APC
  • Docs: root README, @wterm/dom README, apps/docs/src/app/configuration/page.mdx, apps/docs/src/app/api-reference/page.mdx
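As background for the PNG IHDR dimension reader in png.ts: a PNG starts with an 8-byte signature, followed by the IHDR chunk, whose width and height are big-endian 32-bit integers at byte offsets 16 and 20. A minimal sketch of such a reader is below; `readPngSize` is a hypothetical name and the PR's implementation may validate more or less than this.

```typescript
// Hypothetical sketch of an IHDR reader; not the code in png.ts.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function readPngSize(bytes: Uint8Array): { width: number; height: number } | null {
  // 8 (signature) + 4 (chunk length) + 4 (chunk type) + 4 (width) + 4 (height)
  if (bytes.length < 24) return null;
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (bytes[i] !== PNG_SIGNATURE[i]) return null;
  }
  // The first chunk must be IHDR: ASCII "IHDR" at offsets 12..15.
  if (bytes[12] !== 0x49 || bytes[13] !== 0x48 || bytes[14] !== 0x44 || bytes[15] !== 0x52) {
    return null;
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // DataView reads big-endian by default, which matches PNG's byte order.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}
```

Only the first 24 bytes are inspected, so the dimensions are available without decoding the image.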

Test plan

  • 81 unit tests pass in @wterm/dom (12 new tests in kitty-graphics.test.ts covering plain text passthrough, CSI passthrough, non-Kitty APC passthrough, single sequence, BEL terminator, chunked split-across-writes, byte-by-byte streaming, delete action, ESC followed by non-_, _ followed by non-G, text-run coalescing, and reset).
  • All packages type-check (pnpm -r type-check)
  • Existing tests pass in @wterm/core, @wterm/react, @wterm/vue, @wterm/just-bash, @wterm/markdown
  • Verified end-to-end in a headless Chromium: PNG renders inline aligned to the cell grid, surrounding text wraps below the image, no console errors. Screenshot attached locally as /tmp/wterm-kitty.png.

How to demo

pnpm install
pnpm --filter kitty-images-example dev
# then open the URL portless assigns (or run `vite preview` if you don't have portless)

A generated 200×100 PNG (gradient + "wterm 🚀" text) is transmitted via the protocol's chunked APC transport.
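For context, a chunked transfer splits the base64 payload across several APC sequences, with `m=1` on every chunk except the last, which carries `m=0`. The sketch below follows my reading of the Kitty protocol documentation (chunks of at most 4096 base64 characters, each non-final chunk a multiple of 4, full control keys only on the first chunk). `kittyChunks` is a hypothetical helper, not the demo's code in examples/kitty-images/src/main.ts.

```typescript
// Hypothetical encoder sketch; the example app's actual code may differ.
function kittyChunks(base64: string, control: string, chunkSize = 4096): string[] {
  if (chunkSize <= 0 || chunkSize % 4 !== 0) {
    throw new Error("chunkSize must be a positive multiple of 4");
  }
  const pieces: string[] = [];
  for (let off = 0; off < base64.length; off += chunkSize) {
    pieces.push(base64.slice(off, off + chunkSize));
  }
  if (pieces.length === 0) pieces.push(""); // still emit one (empty) sequence

  return pieces.map((piece, idx) => {
    const more = idx < pieces.length - 1 ? 1 : 0;
    // Assumption from the protocol docs: only the first chunk repeats the
    // full control data; later chunks carry just the m key.
    const keys = idx === 0 ? `${control},m=${more}` : `m=${more}`;
    return `\x1b_G${keys};${piece}\x1b\\`;
  });
}
```

A caller would write each returned string to the terminal in order, for example `kittyChunks(b64, "a=T,f=100").forEach((s) => term.write(s))`, where `b64` is the base64-encoded PNG.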

Intercepts APC `\x1b_G...\x1b\\` sequences from input streams in
`@wterm/dom`, decodes the base64 PNG payload (including chunked
`m=1`/`m=0` transfers), and renders the image as an absolutely-
positioned `<img>` overlay inside the cell grid. The overlay layer
stays aligned with its anchor row across scrollback growth because
the renderer inserts new scrollback rows above existing grid rows,
keeping content's pixel position invariant.

Supports actions `t`/`T`/`p`/`d` with PNG format (`f=100`) over the
direct base64 transport, plus `c=`/`r=` cell-fit sizing, `i=`/`I=`
identification, and the `C=1` no-cursor-movement opt-out. The
default `T`/`p` cursor advance is implemented by writing newlines
to the core based on the image's row count (taken from `r=` or
parsed from the PNG IHDR).
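
The row count used for the cursor advance could be derived roughly as follows. This is a sketch under assumptions: `rowsToAdvance` is a hypothetical name, the cell height in pixels is assumed to be known to the caller, and rounding up to whole rows is my guess at the intended behaviour rather than something the commit message states.

```typescript
// Hypothetical helper; the PR's actual arithmetic is not shown in this thread.
function rowsToAdvance(
  control: { r?: number; C?: number },
  pngHeightPx: number,
  cellHeightPx: number,
): number {
  if (control.C === 1) return 0; // C=1: leave the cursor where it is
  if (typeof control.r === "number" && control.r > 0) return control.r; // explicit r= wins
  // Fall back to the PNG's pixel height, rounded up to whole rows (assumption).
  return Math.max(1, Math.ceil(pngHeightPx / cellHeightPx));
}
```

For a 100 px tall image on an 18 px cell grid this gives 6 rows, and an explicit `r=3` gives 3 regardless of the pixel height.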

Not in scope: raw RGB/RGBA frames, file/shared-memory transports,
virtual placement via Unicode placeholders, animations, Sixel, and
iTerm2 inline images.

Opt out with `new WTerm(el, { images: false })`. The same option
is threaded through `@wterm/react` and `@wterm/vue`.

Closes vercel-labs#60

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>

vercel Bot commented May 21, 2026

@claude is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

@trieloff trieloff left a comment

Don't reformat lines that you didn't touch.

Comment thread apps/docs/src/app/api-reference/page.mdx Outdated
Comment thread packages/@wterm/dom/src/terminal.css Outdated
Comment thread packages/@wterm/dom/README.md Outdated
Restores the original compact formatting on lines the Kitty graphics
feature does not need to touch, addressing PR vercel-labs#77 review feedback:

- packages/@wterm/dom/src/terminal.css: --term-font-family back to a
  single line with single quotes; @keyframes cursor-blink back to
  2-line compact form. Keeps position: relative on .term-grid plus
  the new .term-image-layer / .term-image rules.
- packages/@wterm/dom/README.md: Options and Methods tables restored
  to compact pipe-table style; new images row added in the same style;
  Inline images section preserved.
- apps/docs/src/app/api-reference/page.mdx: pre-existing <td><code>
  cells back to one line; new images row added in the same compact
  JSX style.
- README.md: Packages table restored to compact form; Inline images
  Features bullet preserved.
@trieloff trieloff marked this pull request as ready for review May 21, 2026 19:03
Copilot AI review requested due to automatic review settings May 21, 2026 19:03
Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds inline PNG image rendering support via the Kitty terminal graphics protocol across DOM/React/Vue packages, with docs and an example app.

Changes:

  • Introduces a streaming Kitty APC parser and an <img> overlay layer in @wterm/dom, wired into WTerm.write().
  • Adds an images option/prop (default true) to Dom/React/Vue and updates docs/API reference accordingly.
  • Adds tests for the Kitty parser and an example Vite app demonstrating chunked image transfer.

Reviewed changes

Copilot reviewed 21 out of 23 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| packages/@wterm/vue/src/Terminal.ts | Adds images prop and forwards it into WTerm options. |
| packages/@wterm/react/src/Terminal.tsx | Adds images prop and forwards it into WTerm options. |
| packages/@wterm/dom/src/wterm.ts | Wires Kitty filtering + overlay into write path; cursor advance for images; cleanup on destroy. |
| packages/@wterm/dom/src/terminal.css | Adds overlay/image layer styling and makes grid position: relative. |
| packages/@wterm/dom/src/renderer.ts | Preserves non-row children (e.g., image overlay) across renderer setup. |
| packages/@wterm/dom/src/png.ts | Adds lightweight PNG IHDR dimension extraction helper. |
| packages/@wterm/dom/src/kitty-graphics.ts | Adds streaming Kitty graphics APC parser/filter. |
| packages/@wterm/dom/src/index.ts | Exports new Kitty graphics and overlay APIs from @wterm/dom. |
| packages/@wterm/dom/src/image-overlay.ts | Adds `<img>` overlay implementation for rendering placements. |
| packages/@wterm/dom/src/tests/wterm.test.ts | Updates write-path expectations + adds coverage for images: false. |
| packages/@wterm/dom/src/tests/kitty-graphics.test.ts | Adds unit tests for the Kitty streaming filter, chunking, reset. |
| packages/@wterm/dom/README.md | Documents the new images option and usage example. |
| examples/kitty-images/wterm-dom.d.ts | Declares CSS module import for the example. |
| examples/kitty-images/vite.config.ts | Vite config for the new example app. |
| examples/kitty-images/tsconfig.json | TypeScript configuration for the example app. |
| examples/kitty-images/src/main.ts | Demo app that generates a PNG and sends chunked Kitty APC sequences. |
| examples/kitty-images/package.json | Example app package metadata and scripts. |
| examples/kitty-images/index.html | HTML entry for the example app. |
| examples/kitty-images/README.md | Example app instructions and explanation. |
| apps/docs/src/app/configuration/page.mdx | Adds images to shared options + new inline images section. |
| apps/docs/src/app/api-reference/page.mdx | Adds images option to the API reference table. |
| README.md | Mentions inline images as a top-level feature. |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment on lines +92 to +103
export class KittyGraphicsFilter {
private state: State = State.Idle;
/** Buffered Kitty APC payload (bytes between `\x1b_G` and the terminator). */
private apcBuf: number[] = [];
/** Pending chunked transfers keyed by image id (i=) or image number (-I=). */
private pendingChunks = new Map<string, PendingChunk>();

/**
* Push a chunk of bytes through the filter. Returns the ordered list of
* pass-through text segments and completed graphics events.
*/
push(input: Uint8Array): StreamEvent[] {
this.pendingChunks.clear();
}

private _completeApc(events: StreamEvent[]): void {
Comment on lines +220 to +221
const control = parseControl(decodeAscii(ctrlBytes));
const payloadB64 = decodeAscii(payloadBytes);
const more = control.m === 1;
const key = chunkKey(control);

if (more || this.pendingChunks.has(key)) {
Comment on lines +249 to +263
data: decodeBase64(completed.payload),
},
});
return;
}

events.push({
type: "graphics",
event: {
control,
data: decodeBase64(payloadB64),
},
});
}
}
Comment on lines +121 to +125
const blob = new Blob([new Uint8Array(stored.data)], { type: "image/png" });
img.src = URL.createObjectURL(blob);
img.addEventListener("load", () => URL.revokeObjectURL(img.src), {
  once: true,
});
Comment on lines +188 to +192
function imageKey(control: KittyControl): string | null {
  if (typeof control.i === "number") return `i:${control.i}`;
  if (typeof control.I === "number") return `I:${control.I}`;
  return "i:0";
}
Comment thread packages/@wterm/dom/README.md Outdated
When `images: true` (default), wterm intercepts Kitty graphics protocol APC sequences and renders the transmitted PNG as an absolutely-positioned `<img>` overlay aligned to the cell grid. Supports inline base64 transfers (`f=100`) and multi-chunk `m=1`/`m=0` payloads. Actions: `t` (transmit), `T` (transmit + display), `p` (put placement), `d` (delete).

```ts
const png = await fetch("/icon.png").then((r) => r.bytes());
```
const encoded = new TextEncoder().encode("hello");
expect(mockBridge.writeRaw).toHaveBeenCalledWith(encoded);
});

Comment on lines +178 to +183
it("falls back to writeString when images are disabled", async () => {
  const term = new WTerm(element, { autoResize: false, images: false });
  await term.init();
  term.write("hello");
  expect(mockBridge.writeString).toHaveBeenCalledWith("hello");
});
Comment thread packages/@wterm/dom/src/image-overlay.ts Outdated
…test

Addresses automated review feedback on PR vercel-labs#77 (Copilot + Vercel VADE):

image-overlay.ts:
- Track objectUrl per Placement; revoke on replace, _delete, and clear
- Add error listener alongside load to revoke on decode failures
- imageKey() now returns string (always); remove three dead null guards

kitty-graphics.ts:
- Wrap decodeBase64 in try/catch; silently drop malformed APCs
- Cap pendingChunks at MAX_PENDING_CHUNKS=8 and MAX_PENDING_BASE64_BYTES=32MiB
  with oldest-entry eviction; track running pendingBytes
- Replace per-byte decodeAscii loop with TextDecoder('latin1')

dom/README.md:
- Use new Uint8Array(await r.arrayBuffer()) in the inline-images snippet
  instead of non-portable Response.bytes()

Tests:
- wterm.test.ts: new integration test asserts the bridge never sees ESC
  bytes when images are enabled and a Kitty APC is in the input stream
- kitty-graphics.test.ts: cover invalid-base64 silent drop and
  chunk-count cap eviction

84/84 @wterm/dom tests pass; pnpm -r type-check clean.
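
The pending-chunk caps with oldest-entry eviction described in the commit message above could look roughly like this. It is a sketch, not the code in kitty-graphics.ts: `PendingChunks`, `append`, and `take` are hypothetical names, the limits are constructor parameters here rather than the `MAX_PENDING_CHUNKS` / `MAX_PENDING_BASE64_BYTES` constants, and payload size is approximated by base64 string length.

```typescript
// Hypothetical sketch of a bounded accumulator for chunked transfers.
class PendingChunks {
  private entries = new Map<string, string>(); // key -> accumulated base64
  private totalBytes = 0; // running total, so no rescan is needed per append
  private maxEntries: number;
  private maxBytes: number;

  constructor(maxEntries = 8, maxBytes = 32 * 1024 * 1024) {
    this.maxEntries = maxEntries;
    this.maxBytes = maxBytes;
  }

  append(key: string, base64: string): void {
    const prev = this.entries.get(key);
    // Map.set on an existing key keeps its original insertion position,
    // so iteration order stays "oldest transfer first".
    this.entries.set(key, (prev === undefined ? "" : prev) + base64);
    this.totalBytes += base64.length;
    while (this.entries.size > this.maxEntries || this.totalBytes > this.maxBytes) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.take(oldest); // evict and discard
    }
  }

  /** Remove and return the accumulated payload for a finished transfer. */
  take(key: string): string | undefined {
    const value = this.entries.get(key);
    if (value !== undefined) {
      this.entries.delete(key);
      this.totalBytes -= value.length;
    }
    return value;
  }
}
```

One consequence of this design is that a single transfer larger than the byte cap evicts itself, which bounds memory even when a sender never sends the final `m=0` chunk.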
Development

Successfully merging this pull request may close these issues.

Feature: Support terminal inline image graphics protocols

3 participants