feat(dashboard): unified MIME-bundle rail for rich exec outputs (sortable DataFrames + plots) #591

Open
Andrew Gazelka (andrewgazelka) wants to merge 4 commits into main from codex/exec-df-html
Member

@andrewgazelka Andrew Gazelka (andrewgazelka) commented Jun 2, 2026

Exec panes had two ad-hoc rich-output rails: images (model-only) and, as first drafted here, html (dashboard-only). A third rich type would need yet another hardcoded field through worker → wire → hub → frontend. This collapses them into one extensible outputs rail modeled on Jupyter's display-data protocol.

What you get

  • DataFrames render as a sortable HTML grid for the operator (click-to-sort header, sticky header, dtype labels, numeric right-align, light+dark), built from the frame's own rows so it ignores any pl.Config cap.
  • Plots/images now render on the dashboard too — matplotlib/PIL images were previously sent to the model but never shown to the operator.
  • A single cell that display()s a frame and draws a plot produces two ordered bundles, rendered as table + image (screenshot in thread).

Design

outputs is an ordered list of MIME bundles (mime -> data), one per displayed object / eval result / open figure. The worker builds each from the rich-display protocol it already used for images: text/html (df grid) + text/plain (repr), image/* base64, or whatever _repr_mimebundle_/_repr_*_ offers. The frontend has a small renderer registry (DisplayOutputs): text/html → sandboxed allow-scripts iframe, image/* → img, text → pre. Adding a new rich type is frontend-only.
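
As a rough illustration of the bundle shape described above, the Python sketch below builds one MIME bundle per object and picks a renderer for it. The names `build_bundle` and `pick_renderer` are hypothetical and not taken from this PR; the real worker and the DisplayOutputs registry may differ in detail.

```python
from __future__ import annotations

import base64
from typing import Any


def build_bundle(obj: Any) -> dict[str, str]:
    """Collect a mime -> data mapping for one displayed object (sketch only)."""
    # text/plain is always available as a fallback.
    bundle: dict[str, str] = {"text/plain": repr(obj)}

    # Prefer the Jupyter-style multi-format hook when the object offers it.
    mimebundle = getattr(obj, "_repr_mimebundle_", None)
    if callable(mimebundle):
        data = mimebundle()
        if isinstance(data, tuple):  # (data, metadata) form
            data = data[0]
        for mime, value in (data or {}).items():
            if isinstance(value, str):
                bundle[mime] = value

    # Fall back to single-format hooks for anything still missing.
    for method, mime in (("_repr_html_", "text/html"), ("_repr_png_", "image/png")):
        hook = getattr(obj, method, None)
        if mime in bundle or not callable(hook):
            continue
        value = hook()
        if isinstance(value, bytes):  # binary images travel as base64 text
            value = base64.b64encode(value).decode("ascii")
        if isinstance(value, str):
            bundle[mime] = value
    return bundle


def pick_renderer(bundle: dict[str, str]) -> tuple[str, str] | None:
    """Return (element, mime) for the richest representation in the bundle."""
    if "text/html" in bundle:
        return ("iframe", "text/html")
    for mime in sorted(bundle):
        if mime.startswith("image/"):
            return ("img", mime)
    for mime in sorted(bundle):
        if mime.startswith("text/"):
            return ("pre", mime)
    return None
```

Under this sketch, supporting a new rich type only means adding a branch to `pick_renderer`; the bundle itself passes through unchanged.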

Audience split: worker_response_content forwards only image/* + the captured text to the model; rich tables are operator-only, never context. To keep the agent's captured text compact, the session applies a one-time compact pl.Config (20 rows/cols) the first time polars is used — and only if the user hasn't set those knobs, so their config always wins.
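
The audience split can be pictured with a small Python sketch. The real `worker_response_content` lives in Rust (main.rs); the function name `model_content` and the exact block layout here are assumptions for illustration only.

```python
from __future__ import annotations


def model_content(captured_text: str, outputs: list[dict[str, str]]) -> list[dict[str, str]]:
    """Build what the model sees: captured text plus image/* entries only."""
    blocks: list[dict[str, str]] = []
    if captured_text:
        blocks.append({"type": "text", "text": captured_text})
    for bundle in outputs:
        for mime, data in bundle.items():
            # text/html and other rich types are operator-only and are skipped.
            if mime.startswith("image/"):
                blocks.append({"type": "image", "mimeType": mime, "data": data})
    return blocks
```

With a table bundle followed by a plot bundle, only the captured text and the plot's image reach the model; the HTML grid never enters its context.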

Validation

  • cargo test dashboard-core / ix-mcp / dashboard — outputs bundle round-trips wire + Loro doc
  • svelte-check clean
  • Real browser: numeric sort verified on header click; table + matplotlib plot rendered together from a real worker run (light+dark)
  • nix run .#lint green

Linux-only clippy runs in CI.

Made with AI (Claude Opus 4.8).

Note

Add rich MIME-bundle display rail for exec outputs with sortable DataFrame tables

  • Adds a DisplayOutputs Svelte component that renders rich MIME bundles (HTML, images, plain text) in a sandboxed iframe or as image/text elements, shown in exec panes and feed views.
  • Extends ExecView in pane.rs with an outputs field (Vec<BTreeMap<String, String>>) carrying ordered MIME bundles, serialized to JSON and published via hub.rs.
  • Rewrites python_worker.py display collection: builds per-object MIME bundles with HTML, image, and plain-text representations; generates self-contained sortable HTML table documents for polars/pandas DataFrames (capped at MAX_HTML_ROWS).
  • Replaces the previous flat images array in worker responses with an outputs bundle list; MCP tool results now extract image blocks from these bundles instead of a separate images field.
  • Behavioral Change: worker_response_content in main.rs no longer reads response.images; workers must emit outputs bundles or images will not appear in tool results.

Macroscope summarized cf6b0fb.

The MCP Python exec board showed every result as a monospace ASCII table:
the same text went to the human dashboard and the model's tool result, so a
wide polars/pandas frame was both hard to read for the operator and a context
sink for the agent.

Split the two audiences. The worker now collects a self-contained HTML document
for each displayed DataFrame and eval result (reusing the Jupyter rich-display
path already used for images): a click-to-sort grid with sticky header, dtype
labels, numeric right-align, and light/dark styling, built from the frame's own
rows so it is independent of any pl.Config cap. These ride a new html channel
on the exec pane (ExecView.html -> hub text -> sandboxed iframe) and reach only
the dashboard; worker_response_content never forwards them, so the agent's
context is untouched. To keep the agent's captured text compact, the session
applies a one-time compact pl.Config (20 rows/cols) the first time polars is
used, so an existing print(df)/report() stays terse while the human still gets
the full interactive table via display(df).

Made with AI (Claude Opus 4.8).

Co-authored-by: Andrew Gazelka <andrew@ix.dev>

_apply_polars_compact mutates the process-global pl.Config, so re-applying the
compact default would silently undo a pl.Config.set_tbl_rows(...) the user ran in
an earlier cell (the once-flag only flips when polars is already imported at a
cell's start, missing the cell that imports + configures polars together).

Apply the compact repr default only when the user has not set any repr knob
(POLARS_FMT_MAX_ROWS/COLS/STR_LEN via Config.state(if_set=True)), so their own
config always wins. Found in adversarial review (C1).
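
A minimal sketch of that guard, assuming the polars module is passed in as `pl`. The helper names `user_configured_repr` and `apply_compact_defaults` are hypothetical; the PR's own function is `_apply_polars_compact`, whose body is not shown here.

```python
from __future__ import annotations

from typing import Any

# Environment-backed knobs named in the commit message above.
REPR_KNOBS = ("POLARS_FMT_MAX_ROWS", "POLARS_FMT_MAX_COLS", "POLARS_FMT_STR_LEN")


def user_configured_repr(state: dict[str, Any]) -> bool:
    """True if any repr knob already has a value in the given config state."""
    return any(state.get(knob) is not None for knob in REPR_KNOBS)


def apply_compact_defaults(pl: Any) -> bool:
    """Apply the compact repr defaults unless the user set a knob first.

    Returns True when the defaults were applied, False when user config won.
    """
    if user_configured_repr(pl.Config.state(if_set=True)):
        return False
    pl.Config.set_tbl_rows(20)
    pl.Config.set_tbl_cols(20)
    pl.Config.set_fmt_str_lengths(50)
    return True
```

Because the check reads the live config state rather than a once-flag, a cell that imports and configures polars in one step is still respected on the next call.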

Made with AI (Claude Opus 4.8).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e98772dd1b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".

<div class="exec-empty">{running ? '· running…' : '· no output'}</div>
{:else}
<!-- Rich tables lead: a DataFrame's sortable grid is the answer when present. -->
<HtmlTables docs={htmlDocs} {expanded} />

P1: Regenerate the embedded dashboard HTML

This adds the table renderer to the Svelte sources, but the production dashboard served by Rust is still the committed packages/dashboard-core/src/dashboard/dashboard.html (server.rs embeds it with include_str!). I checked that committed HTML with rg and it does not contain HtmlTables, exec-html, or parseExecHtml; packages/dashboard/default.nix also has a dashboardInSync diff check requiring the generated file to match the site build. As a result, packaged/embedded dashboards will not render the new DataFrame tables and the sync check will fail until the rebuilt index.html is copied into dashboard.html.

Comment on lines +500 to +503
head = "".join(
    f'<th class="{"num" if numeric[i] else "txt"}" data-i="{i}">'
    f"{_html_escape(name)}<span class=dt>{_html_escape(dtypes[i])}</span></th>"
    for i, name in enumerate(columns)

P2: Cap rendered table columns

When a pandas/polars frame is very wide, this renderer still emits every column into the HTML document (and the row loop below emits every cell for those columns); only the number of rows is capped. Displaying a 500-row frame with thousands of columns can therefore serialize megabytes of srcdoc into the worker response/Loro snapshot and hang or flood the dashboard, even though the text repr is intentionally compacted. Add a column or byte cap before constructing the table document.

Comment on lines +225 to +227
pl.Config.set_tbl_rows(20)
pl.Config.set_tbl_cols(20)
pl.Config.set_fmt_str_lengths(50)

P2: Preserve user Polars display settings

In a persistent Python session, if a user imports polars and sets pl.Config in that first cell, _apply_polars_compact has not marked itself applied yet because polars was absent at the start of the call; the next call then overwrites the user's persistent table rows/columns/string-length settings with these hard-coded values. That makes later print(df)/repr output differ from the user's configured session state, despite the comment saying explicit user config should win.

Contributor

@github-actions github-actions Bot left a comment

AI review found issues in this pull request.

Verdict: patch is incorrect
Confidence: 0.88

The feature is functional for small outputs, but the new rich-HTML path bypasses the intended resource bounds in common cases and can still overwhelm the worker, dashboard, or recording stream.

  • P1 packages/mcp/src/python_worker.py:521 Rich table HTML is still effectively unbounded
  • P2 packages/mcp/src/python_worker.py:210 MAX_HTML is applied after rendering every candidate

Comment on lines +521 to +536
body_rows = []
for row in rows:
    cells = "".join(
        f'<td class="{"num" if numeric[i] else "txt"}">{_html_escape(_cell(v))}</td>'
        for i, v in enumerate(row)
    )
    body_rows.append(f"<tr>{cells}</tr>")
body = "".join(body_rows)
shown = len(rows)
caption = f"{height} × {len(columns)}"
if shown < height:
    caption += f" · showing first {shown}"
return _wrap_html(
    f'<table><caption>{_html_escape(caption)}</caption>'
    f"<thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>{_SORT_JS}"
)

P1: Rich table HTML is still effectively unbounded

The row count is capped, but each rendered table still emits every column and the full string value of every cell into a single HTML document. A 500-row frame with thousands of columns, or even one column containing very large strings, can produce a huge JSON-RPC response and Loro text field despite the new caps, blocking the worker/dashboard and bloating recordings. Add column, cell-length, and total HTML-byte limits before returning these docs.
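
One way the requested limits could look, as a hedged Python sketch. The constants `MAX_COLS`, `MAX_CELL_CHARS`, and `MAX_HTML_BYTES` and the helpers `clip_cell` and `render_bounded_rows` are hypothetical names with arbitrary values; none of them come from the PR.

```python
from __future__ import annotations

import html
from typing import Any, Sequence

# Hypothetical limits; suitable values would depend on the dashboard.
MAX_COLS = 50
MAX_CELL_CHARS = 200
MAX_HTML_BYTES = 256 * 1024


def clip_cell(value: Any, limit: int = MAX_CELL_CHARS) -> str:
    """Stringify one cell and truncate it to at most `limit` characters."""
    text = "" if value is None else str(value)
    return text if len(text) <= limit else text[: limit - 1] + "…"


def render_bounded_rows(
    columns: Sequence[str],
    rows: Sequence[Sequence[Any]],
    max_cols: int = MAX_COLS,
    max_bytes: int = MAX_HTML_BYTES,
) -> tuple[str, int, int]:
    """Render <tr> rows under column, cell-length, and total-byte caps.

    Returns (html, rows_shown, cols_shown) so a caption can report truncation.
    """
    cols_shown = min(len(columns), max_cols)
    out: list[str] = []
    used = 0
    for row in rows:
        cells = "".join(
            f"<td>{html.escape(clip_cell(v))}</td>" for v in row[:cols_shown]
        )
        line = f"<tr>{cells}</tr>"
        size = len(line.encode("utf-8"))
        if used + size > max_bytes:
            break  # stop before the document exceeds the byte budget
        out.append(line)
        used += size
    return "".join(out), len(out), cols_shown
```

Checking the byte budget per row, rather than after the whole document is built, keeps peak memory bounded as well as the final payload.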

Comment thread packages/mcp/src/python_worker.py Outdated
Comment on lines +210 to +211
docs = [doc for obj in candidates if (doc := _object_html_doc(obj)) is not None]
return docs[:MAX_HTML]

P2: MAX_HTML is applied after rendering every candidate

This list comprehension renders rich HTML for every displayed object before slicing to MAX_HTML. A cell that calls display() on many DataFrames still materializes and wraps all of them, so the cap does not protect worker latency or memory. Stop iterating once MAX_HTML docs have been collected, and treat render failures like image collection does so one bad rich repr cannot fail the whole response.

Suggested change
- docs = [doc for obj in candidates if (doc := _object_html_doc(obj)) is not None]
- return docs[:MAX_HTML]
+ docs: list[str] = []
+ for obj in candidates:
+     if len(docs) >= MAX_HTML:
+         break
+     try:
+         doc = _object_html_doc(obj)
+     except Exception:
+         continue
+     if doc is not None:
+         docs.append(doc)
+ return docs

Replace the two ad-hoc rich-output rails on the exec pane (images, which went
only to the model, and the just-added html, which went only to the dashboard)
with a single extensible `outputs` rail modeled on Jupyter's display-data
protocol: an ordered list of MIME bundles (mime -> data), one per displayed
object / eval result / open figure.

The worker builds each bundle from the rich-display protocol it already used for
images (text/html for a DataFrame's sortable grid + text/plain repr; image/* for
plots and PIL images; whatever _repr_mimebundle_/_repr_*_ offers otherwise). The
frontend gains a small renderer registry (DisplayOutputs) that picks the richest
representation per bundle: text/html -> sandboxed iframe, image/* -> img, text ->
pre. worker_response_content extracts only image/* and the captured text for the
model, so rich tables stay operator-only.

Why: a third rich type (a chart, LaTeX, audio) would otherwise need yet another
hardcoded field threaded through worker, wire, hub, and frontend. One bundle rail
makes a new type frontend-only. It also fixes a latent gap: matplotlib/PIL images
were sent to the model but never rendered on the dashboard; now every output
renders for the operator and the right parts reach the model.

Made with AI (Claude Opus 4.8).
@andrewgazelka Andrew Gazelka (andrewgazelka) changed the title from "feat(dashboard): render DataFrame results as sortable HTML tables" to "feat(dashboard): unified MIME-bundle rail for rich exec outputs (sortable DataFrames + plots)" Jun 2, 2026
Contributor

@github-actions github-actions Bot left a comment

AI review found issues in this pull request.

Verdict: patch is incorrect
Confidence: 0.90

The feature is not actually shipped because the embedded dashboard artifact was not regenerated, and the new DataFrame HTML path can still produce unbounded dashboard payloads despite the stated row cap.

  • P1 packages/dashboard-core/site/src/components/ExecBody.svelte:59 Embedded dashboard HTML is stale
  • P2 packages/mcp/src/python_worker.py:507 DataFrame HTML output is not bounded by columns or bytes

Comment on lines +59 to +60
<!-- Rich outputs lead: a DataFrame's sortable grid or a plot is the answer. -->
<DisplayOutputs {outputs} {expanded} />

P1: Embedded dashboard HTML is stale

This wires the new rich-output renderer into the Svelte source, but the PR does not update packages/dashboard-core/src/dashboard/dashboard.html. The dashboard server embeds that committed HTML via include_str!, and the package has a sync test that diffs it against a fresh site build, so the shipped dashboard will still serve the old JS with no outputs renderer and the in-sync check will fail. Rebuild the site and commit the regenerated dashboard.html with this source change.

Comment on lines +507 to +518
if module == "polars" and hasattr(obj, "columns") and hasattr(obj, "rows"):
    columns = list(obj.columns)
    dtypes = [str(t) for t in obj.dtypes]
    height = int(obj.height)
    rows = obj.head(MAX_HTML_ROWS).rows()
    return {"columns": columns, "dtypes": dtypes, "rows": rows, "height": height}
if module == "pandas" and hasattr(obj, "columns") and hasattr(obj, "itertuples"):
    columns = [str(c) for c in obj.columns]
    dtypes = [str(t) for t in obj.dtypes]
    height = int(len(obj))
    rows = list(obj.head(MAX_HTML_ROWS).itertuples(index=False, name=None))
    return {"columns": columns, "dtypes": dtypes, "rows": rows, "height": height}

P2: DataFrame HTML output is not bounded by columns or bytes

The table preview caps only the number of rows, then materializes every column and stringifies each cell into one HTML document. A wide frame, or a frame with large string/blob cells, can still generate megabytes or more of text/html, bypassing the existing stdout/result capture limits and then storing that payload in the worker response and Loro document. Add a column/cell/output-byte cap before rendering the bundle so one rich display cannot hang or bloat the dashboard.

Contributor

github-actions Bot commented Jun 3, 2026

Blast radius

212 of 1002 checks would rebuild between base f1127e76 and head 6c3a718d.

changed checks
  • eval
  • image-development-base
  • image-kernel-dev
  • image-minecraft
  • image-minecraft-bedrock
  • image-minecraft-status
  • image-minecraft_1.21.11-fabric
  • image-minecraft_1.21.11-paper
  • image-minecraft_26.1.2-fabric
  • image-minecraft_26.1.2-paper
  • image-minecraft_26w17a-fabric
  • image-minestom
  • image-neovim-ci
  • image-remote-desktop
  • image-symphony-codex
  • image-test-cluster-bootstrap
  • lint
  • rust-agents-md-cargoMachete
  • rust-agents-md-clippy
  • rust-agents-md-unusedCrateDependencies
  • rust-ast-merge-ast-cargoMachete
  • rust-ast-merge-ast-clippy
  • rust-ast-merge-ast-unusedCrateDependencies
  • rust-ast-merge-cargoMachete
  • rust-ast-merge-clippy
  • rust-ast-merge-diff-cargoMachete
  • rust-ast-merge-diff-clippy
  • rust-ast-merge-diff-unusedCrateDependencies
  • rust-ast-merge-git-cargoMachete
  • rust-ast-merge-git-clippy
  • rust-ast-merge-git-unusedCrateDependencies
  • rust-ast-merge-langs-cargoMachete
  • rust-ast-merge-langs-clippy
  • rust-ast-merge-langs-unusedCrateDependencies
  • rust-ast-merge-matcher-cargoMachete
  • rust-ast-merge-matcher-clippy
  • rust-ast-merge-matcher-unusedCrateDependencies
  • rust-ast-merge-unusedCrateDependencies
  • rust-clone-cargoMachete
  • rust-clone-clippy
  • rust-clone-detect-cargoMachete
  • rust-clone-detect-clippy
  • rust-clone-detect-unusedCrateDependencies
  • rust-clone-hash-cargoMachete
  • rust-clone-hash-clippy
  • rust-clone-hash-unusedCrateDependencies
  • rust-clone-pragma-cargoMachete
  • rust-clone-pragma-clippy
  • rust-clone-pragma-unusedCrateDependencies
  • rust-clone-scanner-cargoMachete
  • rust-clone-scanner-clippy
  • rust-clone-scanner-unusedCrateDependencies
  • rust-clone-unusedCrateDependencies
  • rust-code-highlight-cargoMachete
  • rust-code-highlight-clippy
  • rust-code-highlight-unusedCrateDependencies
  • rust-dag-runner-cargoMachete
  • rust-dag-runner-clippy
  • rust-dag-runner-unusedCrateDependencies
  • rust-dashboard-cargoMachete
  • rust-dashboard-clippy
  • rust-dashboard-dashboard-all
  • rust-dashboard-dashboardInSync
  • rust-dashboard-package
  • rust-dashboard-site
  • rust-dashboard-unusedCrateDependencies
  • rust-file-language-cargoMachete
  • rust-file-language-clippy
  • rust-file-language-unusedCrateDependencies
  • rust-file-search-cargoMachete
  • rust-file-search-clippy
  • rust-file-search-unusedCrateDependencies
  • rust-git-log-pretty-cargoMachete
  • rust-git-log-pretty-clippy
  • rust-git-log-pretty-unusedCrateDependencies
  • rust-github-avatar-cargoMachete
  • rust-github-avatar-clippy
  • rust-github-avatar-unusedCrateDependencies
  • rust-indexbench-cargoMachete
  • rust-indexbench-clippy
  • rust-indexbench-unusedCrateDependencies
  • rust-indexer-cargoMachete
  • rust-indexer-clippy
  • rust-indexer-unusedCrateDependencies
  • rust-ix-dev-diagnose-cargoMachete
  • rust-ix-dev-diagnose-clippy
  • rust-ix-dev-diagnose-unusedCrateDependencies
  • rust-ix-vt-cargoMachete
  • rust-ix-vt-clippy
  • rust-ix-vt-unusedCrateDependencies
  • rust-kitty-cargoMachete
  • rust-kitty-clippy
  • rust-kitty-unusedCrateDependencies
  • rust-mcp-cargoMachete
  • rust-mcp-clippy
  • rust-mcp-dataLibsBundled
  • rust-mcp-gmailLibsBundled
  • rust-mcp-ix-mcp-all
  • rust-mcp-ix-mcp-tests-call_timeout_clamps_absurd_budgets
  • rust-mcp-ix-mcp-tests-cancelled_call_returns_and_session_recovers
  • rust-mcp-ix-mcp-tests-create_session_rejects_unknown_fields
  • rust-mcp-ix-mcp-tests-hung_call_times_out_and_session_recovers
  • rust-mcp-ix-mcp-tests-subprocess_output_is_captured_at_fd_level
  • rust-mcp-ix-mcp-tests-synchronous_expressions_still_evaluate
  • rust-mcp-ix-mcp-tests-top_level_await_persists_async_state_across_calls
  • rust-mcp-ix-mcp-tests-worker_stderr_burst_cannot_block_protocol
  • rust-mcp-package
  • rust-mcp-replDefault
  • rust-mcp-searchBundled
  • rust-mcp-sessionSubprocessStdin
  • rust-mcp-sessionVenv
  • rust-mcp-tuiBundled
  • rust-mcp-unusedCrateDependencies
  • rust-minecraft-nbt-cargoMachete
  • rust-minecraft-nbt-clippy
  • rust-minecraft-nbt-unusedCrateDependencies
  • rust-minecraft-sound-cargoMachete
  • rust-minecraft-sound-clippy
  • rust-minecraft-sound-unusedCrateDependencies
  • rust-minecraft-sync-managed-cargoMachete
  • rust-minecraft-sync-managed-clippy
  • rust-minecraft-sync-managed-unusedCrateDependencies
  • rust-mixedbread-cargoMachete
  • rust-mixedbread-clippy
  • rust-mixedbread-unusedCrateDependencies
  • rust-nix-web-monitor-cargoMachete
  • rust-nix-web-monitor-clippy
  • rust-nix-web-monitor-unusedCrateDependencies
  • rust-oci-image-builder-cargoMachete
  • rust-oci-image-builder-clippy
  • rust-oci-image-builder-unusedCrateDependencies
  • rust-progress-style-cargoMachete
  • rust-progress-style-clippy
  • rust-progress-style-unusedCrateDependencies
  • rust-reel-cargoMachete
  • rust-reel-clippy
  • rust-reel-package
  • rust-reel-printsHelp
  • rust-reel-reel-all
  • rust-reel-unusedCrateDependencies
  • rust-resource-monitor-stats-writer-cargoMachete
  • rust-resource-monitor-stats-writer-clippy
  • rust-resource-monitor-stats-writer-unusedCrateDependencies
  • rust-screencast-ingest-cargoMachete
  • rust-screencast-ingest-clippy
  • rust-screencast-ingest-unusedCrateDependencies
  • rust-search-cargoMachete
  • rust-search-clippy
  • rust-search-core-cargoMachete
  • rust-search-core-clippy
  • rust-search-core-unusedCrateDependencies
  • rust-search-unusedCrateDependencies
  • rust-sink-mixedbread-cargoMachete
  • rust-sink-mixedbread-clippy
  • rust-sink-mixedbread-unusedCrateDependencies
  • rust-sink-parquet-cargoMachete
  • rust-sink-parquet-clippy
  • rust-sink-parquet-unusedCrateDependencies
  • rust-source-atuin-cargoMachete
  • rust-source-atuin-clippy
  • rust-source-atuin-unusedCrateDependencies
  • rust-source-claude-cargoMachete
  • rust-source-claude-clippy
  • rust-source-claude-unusedCrateDependencies
  • rust-source-codex-cargoMachete
  • rust-source-codex-clippy
  • rust-source-codex-unusedCrateDependencies
  • rust-source-debug-cargoMachete
  • rust-source-debug-clippy
  • rust-source-debug-unusedCrateDependencies
  • rust-source-git-cargoMachete
  • rust-source-git-clippy
  • rust-source-git-unusedCrateDependencies
  • rust-source-linear-cargoMachete
  • rust-source-linear-clippy
  • rust-source-linear-unusedCrateDependencies
  • rust-source-meta-cargoMachete
  • rust-source-meta-clippy
  • rust-source-meta-unusedCrateDependencies
  • rust-source-slack-cargoMachete
  • rust-source-slack-clippy
  • rust-source-slack-unusedCrateDependencies
  • rust-tap-cargoMachete
  • rust-tap-clippy
  • rust-tap-protocol-cargoMachete
  • rust-tap-protocol-clippy
  • rust-tap-protocol-unusedCrateDependencies
  • rust-tap-pty-cargoMachete
  • rust-tap-pty-clippy
  • rust-tap-pty-unusedCrateDependencies
  • rust-tap-session-all
  • rust-tap-session-multiplayer_shares_output_and_sizes_to_smallest_client
  • rust-tap-session-resize_while_attached_reaches_the_session
  • rust-tap-session-second_attach_resyncs_a_full_screen_tui
  • rust-tap-session-session_round_trips_input_and_serves_scrollback
  • rust-tap-tap-all
  • rust-tap-tap-config-tests-defaults_are_sane
  • rust-tap-tap-config-tests-matches_kitty_csi_u
  • rust-tap-tap-config-tests-matches_legacy_sequences
  • rust-tap-tap-config-tests-parses_alt_and_ctrl
  • rust-tap-tap-editor-tests-builds_position_args_per_editor
  • rust-tap-tap-editor-tests-detects_known_editors
  • rust-tap-tap-input-tests-alt_e_fires_editor_across_split_reads
  • rust-tap-tap-input-tests-ctrl_backslash_detaches
  • rust-tap-tap-input-tests-escape_then_other_key_passes_both
  • rust-tap-tap-input-tests-lone_escape_is_held_then_released
  • rust-tap-tap-input-tests-passes_normal_input_through
  • rust-tap-tap-names-tests-generates_three_dashed_segments
  • rust-tap-unusedCrateDependencies
  • rust-terminal-theme-cargoMachete
  • rust-terminal-theme-clippy
  • rust-terminal-theme-unusedCrateDependencies

Contributor

@github-actions github-actions Bot left a comment

AI review found issues in this pull request.

Verdict: patch is incorrect
Confidence: 0.84

The rich display path introduces a browser script execution surface for arbitrary HTML outputs and can still create unbounded dashboard payloads for wide or large-cell DataFrames.

  • P1 packages/dashboard-core/site/src/components/DisplayOutputs.svelte:30 Arbitrary rich HTML is executed in the dashboard iframe
  • P2 packages/mcp/src/python_worker.py:507 DataFrame HTML output is still effectively unbounded for wide or large-cell data

Comment on lines +30 to +35
<iframe
  class="display-html"
  class:expanded
  title="output {i + 1}"
  sandbox="allow-scripts"
  srcdoc={bundle[mime]}

P1: Arbitrary rich HTML is executed in the dashboard iframe

This renders every text/html bundle with sandbox="allow-scripts". The HTML is not limited to the worker-generated table; _repr_mimebundle_ and _repr_html_ output from arbitrary displayed objects reaches this path too. A sandbox without allow-same-origin prevents host DOM access, but it still lets the frame script read its own rendered output and send network requests, so displaying an untrusted object can execute browser-side JS and exfiltrate operator-only data. Keep arbitrary HTML script-disabled or sanitize/distinguish the trusted table renderer from third-party HTML.
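
One possible way to distinguish the trusted table renderer from third-party HTML is sketched below in Python. The MIME type `TRUSTED_TABLE_MIME` and both helper names are invented for illustration and appear nowhere in the PR; the idea is that only the worker's own grid is tagged with the trusted type, and that any object-supplied bundle has that key stripped before it is forwarded.

```python
from __future__ import annotations

# Hypothetical MIME type reserved for the worker-generated sortable table.
TRUSTED_TABLE_MIME = "application/vnd.example.sortable-table+html"


def strip_untrusted_claims(bundle: dict[str, str]) -> dict[str, str]:
    """Drop the trusted key from a bundle produced by an arbitrary object.

    Without this step, a third-party _repr_mimebundle_ could claim the
    trusted type for itself and regain script execution.
    """
    return {mime: data for mime, data in bundle.items() if mime != TRUSTED_TABLE_MIME}


def iframe_sandbox(mime: str) -> str:
    """Return the sandbox attribute value to use when rendering `mime`.

    An empty sandbox value applies every restriction, so scripts stay
    disabled for ordinary text/html.
    """
    return "allow-scripts" if mime == TRUSTED_TABLE_MIME else ""
```

The cost of this design is that third-party HTML which relies on its own scripts would render statically.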

Comment on lines +507 to +518
if module == "polars" and hasattr(obj, "columns") and hasattr(obj, "rows"):
    columns = list(obj.columns)
    dtypes = [str(t) for t in obj.dtypes]
    height = int(obj.height)
    rows = obj.head(MAX_HTML_ROWS).rows()
    return {"columns": columns, "dtypes": dtypes, "rows": rows, "height": height}
if module == "pandas" and hasattr(obj, "columns") and hasattr(obj, "itertuples"):
    columns = [str(c) for c in obj.columns]
    dtypes = [str(t) for t in obj.dtypes]
    height = int(len(obj))
    rows = list(obj.head(MAX_HTML_ROWS).itertuples(index=False, name=None))
    return {"columns": columns, "dtypes": dtypes, "rows": rows, "height": height}

P2: DataFrame HTML output is still effectively unbounded for wide or large-cell data

MAX_HTML_ROWS only limits row count; this path still collects all columns and later stringifies every cell at full length into the outputs JSON stored in the hub. A 500-row frame with thousands of columns, or even one huge string cell, can produce very large worker responses and Loro text bodies despite the existing stdout/result caps. Add column, cell-length, or total rendered-byte limits before serializing the dashboard table.
