fix(plugin-chart-ag-grid-table): use display text for filter and sort on HTML cells#39885
fix(plugin-chart-ag-grid-table): use display text for filter and sort on HTML cells#39885fitzee wants to merge 3 commits intoapache:masterfrom
Conversation
richardfogaca
left a comment
There was a problem hiding this comment.
Posting on Richard's behalf — this is his PR reviewer agent. Forward any pushback to him and he'll loop me back in.
Left two functional notes below. The plain-string comparator change looks like the main thing to check; the server-pagination note may just need an explicit scope/test decision. All line numbers verified against HEAD df794f79aa.
Functional — worth investigating before merge
-
superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/htmlTextFilterValueGetter.ts:49localeComparenow runs for everyGenericDataType.Stringcolumn, not just the HTML cases. That does not preserve the previous AG Grid default string ordering for plain values; for example, simple JS comparison ordersZ < a, whileZ.localeCompare('a')returns the opposite order in the default runtime locale.WDYT — could we normalize HTML values to visible text but keep AG Grid/default comparator semantics for the final comparison, so plain string columns stay unaffected as the PR description says?
-
superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/useColDefs.ts:384In server-pagination mode this later spread replaces the string-column
htmlTextComparatorfrom line 328 with() => 0, so server-paginated Table V2 charts will still delegate sorting to the server/raw column value path. That seems like it can leave the HTML visible-text sort gap open for the same kind of calculated HTML string column when server pagination is enabled.WDYT — should this mode be supported too, or should the PR make the client-side-only scope explicit and cover that expectation in tests/docs?
Praise
-
superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/htmlTextFilterValueGetter.test.ts:30Nice focused coverage around visible-text extraction, nested markup, pass-through values, null handling, and comparator ordering. It makes the intended behavior of the new utility easy to reason about.
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Thanks for the review (and please pass thanks to the bot wrangler). Both points addressed in 8eab7e9: 1. Comparator preserving default semantics — Replaced 2. Server-pagination scope — Agreed: server-paginated tables delegate sort to the backend (via the existing |
richardfogaca
left a comment
There was a problem hiding this comment.
LGTM, congrats on your first superset PR @fitzee 💯
Code Review Agent Run #31e869Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #39885 +/- ##
=======================================
Coverage 64.37% 64.37%
=======================================
Files 2569 2570 +1
Lines 134745 134765 +20
Branches 31278 31288 +10
=======================================
+ Hits 86737 86757 +20
Misses 46510 46510
Partials 1498 1498
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
… on HTML cells The Table V2 chart's TextCellRenderer renders cells containing HTML markup (e.g. anchor tags) via dangerouslySetInnerHTML, but the column's filterValueGetter and comparator were not configured for string columns. As a result both filter and sort operated on the raw HTML string instead of the visible text: - Column sort ordered rows by raw HTML lexically, so the URL inside the markup dictated ordering instead of the visible label (e.g. all rows with a `https://aaa/...` URL sorted before rows with `https://bbb/...` regardless of the label shown to the user). - The text "Contains" filter matched against the raw HTML, causing false positives on URL fragments and false negatives when typing the visible label. Add a small `htmlTextFilterValueGetter` utility (with a matching `htmlTextComparator`) that extracts visible text from HTML values via DOMParser + sanitizeHtml, and wire it into the colDef for string columns. Pass-through for non-HTML values.
… document client-side scope Address review feedback: - `htmlTextComparator` now uses codepoint comparison (`<` / `>`) instead of `localeCompare`, so it matches AG Grid's default string comparator semantics for plain (non-HTML) values. Locale-aware comparison flipped the order of e.g. `'Z'` vs `'a'`, which would have changed sort behavior for plain string columns the PR description claims are unaffected. Added a test asserting the default ordering is preserved. - Added a comment in `useColDefs.ts` documenting the client-side scope: when `serverPagination` is enabled, sorting is delegated to the server via the existing `comparator: () => 0` override, so HTML normalization does not apply. Server-paginated tables with HTML columns are out of scope for this fix and would require server-side handling.
8eab7e9 to
517dd90
Compare
| ...(dataType === GenericDataType.String && { | ||
| // HTML cells (e.g. anchor markup) are rendered by TextCellRenderer | ||
| // via dangerouslySetInnerHTML; without these the filter and sort | ||
| // operate on raw HTML so the URL inside the markup dictates order | ||
| // and the "Contains" filter matches against the raw HTML string. | ||
| // | ||
| // Scope: client-side only. When `serverPagination` is enabled, a | ||
| // later spread overrides `comparator` with `() => 0` so sorting is | ||
| // delegated to the server; the database does not know to extract | ||
| // visible text from HTML, so server-paginated tables with HTML | ||
| // columns are out of scope for this fix. | ||
| filterValueGetter: htmlTextFilterValueGetter, | ||
| comparator: htmlTextComparator, |
There was a problem hiding this comment.
🟠 Architect Review — HIGH
When serverPagination is enabled, string columns still use htmlTextFilterValueGetter on the client while server-side filtering continues to operate on raw HTML values, so client-side filter evaluation is based on visible text but the backend query is based on the unstripped HTML, leading to inconsistent or empty results for the same filter (e.g. HTML with nested tags where the visible text is not a contiguous substring of the raw HTML).
Suggestion: Gate htmlTextFilterValueGetter behind client-side mode for string columns when serverPagination is true, or introduce equivalent HTML-to-text normalization in the server-side filter conversion so both client and server evaluate filters against the same text value.
Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is an **Architect / Logical Review** comment left during a code review. These reviews are first-class, important findings — not optional suggestions. Do NOT dismiss this as a 'big architectural change' just because the title says architect review; most of these can be resolved with a small, localized fix once the intent is understood.
**Path:** superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/useColDefs.ts
**Line:** 323:335
**Comment:**
*HIGH: When serverPagination is enabled, string columns still use htmlTextFilterValueGetter on the client while server-side filtering continues to operate on raw HTML values, so client-side filter evaluation is based on visible text but the backend query is based on the unstripped HTML, leading to inconsistent or empty results for the same filter (e.g. HTML with nested tags where the visible text is not a contiguous substring of the raw HTML).
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
If a suggested approach is provided above, use it as the authoritative instruction. If no explicit code suggestion is given, you MUST still draft and apply your own minimal, localized fix — do not punt back with 'no suggestion provided, review manually'. Keep the change as small as possible: add a guard clause, gate on a loading state, reorder an await, wrap in a conditional, etc. Do not refactor surrounding code or expand scope beyond the finding.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| const stripHtmlToText = (html: string): string => { | ||
| const doc = new DOMParser().parseFromString(sanitizeHtml(html), 'text/html'); | ||
| return (doc.body.textContent || '').trim(); | ||
| }; |
There was a problem hiding this comment.
Suggestion: The comparator sanitizes and parses HTML during every comparison call, and sorting invokes the comparator many times (O(n log n)), which can cause major UI slowdowns on larger tables. Precompute/cache the extracted text once per row value (for example via value getter or memoization) and compare cached text instead of reparsing inside comparator hot path. [performance]
Severity Level: Major ⚠️
- ⚠️ Sorting large HTML string columns can lag noticeably.
- ⚠️ Comparator does DOMParser work on every comparison call.Steps of Reproduction ✅
1. Create a Table V2 (AG Grid) chart via `TableChart` in
`superset-frontend/plugins/plugin-chart-ag-grid-table/src/AgGridTableChart.tsx:55-89`,
with a string dimension whose values contain HTML (e.g., anchor tags rendered by
`TextCellRenderer`), and a large row count (thousands of records).
2. `TableChart` builds column definitions using `useColDefs`
(`superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/useColDefs.ts:215-311`);
for string columns (`dataType === GenericDataType.String`) it sets `filterValueGetter:
htmlTextFilterValueGetter` and `comparator: htmlTextComparator` on the colDef at lines
44-57.
3. When the user clicks that column's header to sort, AG Grid (via `ThemedAgGridReact`,
imported in `AgGridTableChart.tsx:29-32`) repeatedly invokes `htmlTextComparator` from
`htmlTextFilterValueGetter.ts:47-57` as part of its `O(n log n)` sort, passing the raw
cell values for many pairwise comparisons.
4. For each comparison where the value looks like HTML, `htmlTextComparator` calls
`stripHtmlToText` (`htmlTextFilterValueGetter.ts:22-25`), which runs `sanitizeHtml(html)`
and `new DOMParser().parseFromString(..., 'text/html')`. With thousands of HTML cells,
this DOM parsing and sanitization happens thousands of times in a hot sort loop, which can
realistically cause noticeable UI sluggishness while sorting large HTML-heavy tables.
Precomputing or caching the stripped text per row would avoid this repeated work.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/htmlTextFilterValueGetter.ts
**Line:** 22:25
**Comment:**
*Performance: The comparator sanitizes and parses HTML during every comparison call, and sorting invokes the comparator many times (`O(n log n)`), which can cause major UI slowdowns on larger tables. Precompute/cache the extracted text once per row value (for example via value getter or memoization) and compare cached text instead of reparsing inside comparator hot path.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix|
The flagged issue is correct: when serverPagination is enabled, client-side filtering uses htmlTextFilterValueGetter to strip HTML for visible text, but server-side filtering operates on raw HTML, causing inconsistent results. To resolve, gate the htmlTextFilterValueGetter and htmlTextComparator behind client-side mode by adding !serverPagination to the condition. superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/useColDefs.ts |
…lient-side mode Address two bot review comments: - **Performance:** `stripHtmlToText` ran `DOMParser` + `sanitizeHtml` on every comparator call. Sorts invoke the comparator O(n log n) times, which adds up on large tables. Cache extracted text by raw HTML string in a module-level Map; bounded by the number of unique HTML values in the dataset. - **Server-pagination consistency:** Gate the string-column branch on `!serverPagination`. In server-pagination mode, the typed filter value is converted into a backend query that operates on the raw HTML stored in the database; stripping HTML client-side via `filterValueGetter` caused a mismatch where the typed text was searched against visible labels client-side but raw HTML server-side, leading to false positives or empty results. Server-paginated tables with HTML columns remain out of scope (they revert to pre-PR behavior — same backend query path as before, no inconsistency). Sort in server-pagination mode is already delegated to the server via the existing `comparator: () => 0` override, so removing the `htmlTextComparator` here loses nothing in that mode.
|
Thanks for the bot review (and human reviewers — your patience). Both points addressed in the latest push ( 1. Performance — 2. Server-pagination consistency — The existing tests still pass; the change is a strict-superset of the previous behavior in client-side mode. |
Code Review Agent Run #718574Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
|
Small suggestion before merge: the Would it be worth caching the normalized comparator value per raw string, so each unique HTML cell only pays the detection/parsing cost once during sort? |
SUMMARY
The Table V2 chart's
TextCellRendererrenders cells containing HTML markup (e.g. anchor tags) viadangerouslySetInnerHTML, but the column definition'sfilterValueGetterandcomparatorare not configured for string columns. Both filter and sort therefore operate on the raw HTML string instead of the visible text:hrefstarts withhttps://aaa/...sort before rows withhttps://bbb/...regardless of what label is shown to the user.https) matches every HTML row.This change adds a small
htmlTextFilterValueGetterutility (with a matchinghtmlTextComparator) that extracts the visible text from HTML values viaDOMParser+sanitizeHtml, and wires it into the colDef for string columns. Pass-through for non-HTML values, so plain string columns are unaffected — the comparator deliberately matches AG Grid's default codepoint ordering ('Z'before'a') so plain string columns behave identically to before.Scope: Client-side sort and filter only. When
serverPaginationis enabled, the existingcomparator: () => 0override delegates sorting to the backend; the database does not know to extract visible text from HTML, so server-paginated tables with HTML columns would need server-side handling and are out of scope for this PR.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Not applicable — the bug is observable via sort order rather than a visually distinct rendering.
TESTING INSTRUCTIONS
AG_GRID_TABLE_ENABLEDfeature flag.'<a href="https://example.com/' || id || '">label_' || id || '</a>'Unit tests added in
htmlTextFilterValueGetter.test.tscover HTML extraction (anchor and nested markup), pass-through for plain strings, nulls, and numbers, plus the comparator's ordering, null handling, and preservation of AG Grid's default codepoint ordering for plain strings.ADDITIONAL INFORMATION
AG_GRID_TABLE_ENABLED