Skip to content

MILAB-6463: select graph PFrame columns via the discover engine#1708

Merged
vadimpiven merged 1 commit into
mainfrom
MILAB-6463_graph-pframe-discover-integration
Jun 19, 2026
Merged

MILAB-6463: select graph PFrame columns via the discover engine#1708
vadimpiven merged 1 commit into
mainfrom
MILAB-6463_graph-pframe-discover-integration

Conversation

@vadimpiven

@vadimpiven vadimpiven commented Jun 18, 2026

Copy link
Copy Markdown
Member

Problem

getRelatedColumns (used by createPFrameForGraphs, which builds the PFrame behind every block's graphs) selected related result-pool columns with a partial-axis-match heuristic. A column that shared a single low-cardinality context axis with the block (e.g. pl7.app/sampleId) was admitted, and its other, foreign axis was then seeded into the candidate set — dragging in an entire disconnected column universe. The resulting PFrame fails on join with SpecError … some of axes sets are disjoint.

Concretely, in clonotype-space an IG-anchored graph PFrame pulled in a redefined-clonotype universe (a sibling block's columns sharing only sampleId), and the multi-sequence-alignment join crashed.

Fix

Decide column membership with the spec-driver discover engine (ColumnCollectionBuilder, enrichment mode) instead of the JS heuristic:

  • A candidate is kept only if discovery integrates it with the block's columns through linkers — enrichment mode rejects a hit that would introduce a floating/disconnected axis.
  • Block columns (anchors) always stay.
  • Linkers stay only when they connect to the block axis universe (a linker reaching a different universe would itself leave the join disjoint).
  • When discovery is unavailable (no anchors, or it throws), the selection is conservative — block columns + connected linkers only — rather than keeping all candidates, which would re-admit disjoint universes. With no anchors this yields an essentially empty result, matching prior behavior.

Column assembly (enrichCompatible) is unchanged; only the selection step changed.

Behavior verified (clonotype-space, all input anchors)

  • IG — graph PFrame contains only the IG universe (+ connected linkers); alignment renders.
  • Redefined (IG / VDJRegion aa) — redefined-only universe, no disconnected linkers; alignment returns rows (previously the disjoint SpecError).
  • TCRAB — empty, which is correct: this dataset has no TCR clonotypes (Beta/Alpha = 0 rows at the MiXCR source). The previous heuristic masked this by leaking in unrelated data.

Known deviations from the previous heuristic (documented inline, for follow-up)

These are inherent to selecting via discover while still assembling with enrichCompatible, and are noted as code comments in getRelatedColumns:

  1. Discover qualifications / linker paths are not applied at assembly; included columns are still qualified by enrichCompatible's blockAxes heuristic, so a hit needing a non-trivial qualification to integrate may not be qualified.
  2. Label columns no longer get the looser "some axis matches" rule they had — they pass through the same discover verdict, so loosely-reachable axis-label columns may be dropped.
  3. Columns reachable only via a non-anchor intermediate column (the old allAxes "hanging" set) are not rediscovered — discovery anchors on rootColumns only.
  4. Discovery runs one anchor per distinct rootColumn spec; very large block column sets may stress the spec-driver (QuickJS) deadline.

Scope / testing note

createPFrameForGraphs runs for every block's graphs. This was verified end-to-end on clonotype-space across all three anchors; it should go through the monorepo test suite before merge given the broad reach.

Greptile Summary

This PR fixes a disjoint-axis SpecError in createPFrameForGraphs by replacing a loose JS heuristic in getRelatedColumns with the spec-driver discover engine (enrichment mode). The old heuristic admitted result-pool columns that shared any single axis with the block, which could drag in entire disconnected universes; the new path only accepts columns that the discover engine can integrate without leaving a floating axis.

  • Column selection rewritten: candidates now contains all predicate-matching result-pool columns, and membership is decided by AnchoredColumnCollection.findColumns({ mode: "enrichment" }); root columns (anchors) and linkers that share an axis with the block are unconditionally kept.
  • Conservative fallback: when the discover engine is unavailable or throws (e.g. anchor specs not in ctx sources, QuickJS deadline exceeded on large sets), integrableIds is left undefined and only block columns + directly-connected linkers are returned — rather than re-admitting the full candidate set.
  • Known deviations documented inline: label columns, columns reachable only via intermediate non-anchor columns, and linker-connectivity checks still use the old JS heuristic paths, flagged for follow-up.

Confidence Score: 3/5

The core fix is sound and verified end-to-end on clonotype-space, but the outer catch silently suppresses all discover-engine errors — including the QuickJS deadline explicitly called out as a risk for large column sets — which could produce unexpectedly empty graphs in production with no diagnostic trail.

The silent catch on a code path with a known stress trigger (large block column sets hitting QuickJS deadline) is the main concern: the graph silently loses enrichment columns and nothing surfaces to the operator. Combined with the potential ID-namespace mismatch and the still-heuristic linker path, the change needs the logging gap addressed and a broader test run before it can be considered fully safe.

sdk/model/src/pframe_utils/columns.ts — specifically the bare catch block around the discover-engine call and the linker-connectivity check.

Important Files Changed

Filename Overview
sdk/model/src/pframe_utils/columns.ts Replaces the JS axis-heuristic in getRelatedColumns with the spec-driver discover engine (enrichment mode) for candidate filtering; linkers still use a JS heuristic; silent catch on discover failure has no logging; potential PObjectId namespace mismatch between integrableIds and candidates.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[getRelatedColumns called
rootColumns + predicate] --> B[Build blockAxes
from rootColumns axes]
    B --> C[Collect linkerColumns
matching predicate]
    C --> D[getAvailableWithLinkersAxes
expand blockAxes with
connected linker axes]
    D --> E[Fetch candidates
columns.getColumns predicate]
    E --> F{anchorSpecs
non-empty?}
    F -- No --> G[integrableIds = undefined]
    F -- Yes --> H[ColumnCollectionBuilder
.addSources ctx
.build with anchors]
    H -- throws --> G
    H -- ok --> I[collection.findColumns
mode: enrichment]
    I -- throws --> G
    I -- ok --> J[integrableIds = Set of discovered IDs
collection.dispose]
    G --> K[Filter candidates]
    J --> K
    K --> L{rootIds.has c.id?}
    L -- Yes --> M[include]
    L -- No --> N{isLinkerColumn?}
    N -- Yes --> O{linkerConnectsToBlock?
JS heuristic on blockAxes}
    O -- Yes --> M
    O -- No --> P[exclude]
    N -- No --> Q{integrableIds?.has c.id}
    Q -- true --> M
    Q -- false/undefined --> P
    M --> R[enrichCompatible
assemble final PFrameDef]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[getRelatedColumns called
rootColumns + predicate] --> B[Build blockAxes
from rootColumns axes]
    B --> C[Collect linkerColumns
matching predicate]
    C --> D[getAvailableWithLinkersAxes
expand blockAxes with
connected linker axes]
    D --> E[Fetch candidates
columns.getColumns predicate]
    E --> F{anchorSpecs
non-empty?}
    F -- No --> G[integrableIds = undefined]
    F -- Yes --> H[ColumnCollectionBuilder
.addSources ctx
.build with anchors]
    H -- throws --> G
    H -- ok --> I[collection.findColumns
mode: enrichment]
    I -- throws --> G
    I -- ok --> J[integrableIds = Set of discovered IDs
collection.dispose]
    G --> K[Filter candidates]
    J --> K
    K --> L{rootIds.has c.id?}
    L -- Yes --> M[include]
    L -- No --> N{isLinkerColumn?}
    N -- Yes --> O{linkerConnectsToBlock?
JS heuristic on blockAxes}
    O -- Yes --> M
    O -- No --> P[exclude]
    N -- No --> Q{integrableIds?.has c.id}
    Q -- true --> M
    Q -- false/undefined --> P
    M --> R[enrichCompatible
assemble final PFrameDef]
Loading

Fix All in Claude Code

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
sdk/model/src/pframe_utils/columns.ts:129-131
**Silent catch discards all discover-engine errors without logging**

The bare `catch` swallows every exception — including the QuickJS deadline that the PR description explicitly flags as a known risk for large block-column sets. When that happens, `integrableIds` stays `undefined`, the function returns only block columns + connected linkers, and graphs silently become empty or partial. There is no way for an operator or developer to know whether the conservative fallback fired due to a legitimate "no anchors" condition or due to a spec-driver crash/timeout. Adding at least a `console.warn` (or the project's standard logger) with the caught error and the number of anchor specs would make production incidents diagnosable without changing the fallback behaviour.

### Issue 2 of 3
sdk/model/src/pframe_utils/columns.ts:141-143
**Linker inclusion still uses JS axis-match heuristic, not the discover engine**

`linkerConnectsToBlock` admits any linker that shares at least one axis with the expanded `blockAxes` (root-column axes + axes reachable via already-connected linkers). This is the same "some axis overlaps" heuristic that the PR replaces for regular result-pool columns. A linker that shares only a low-cardinality context axis (e.g. `pl7.app/sampleId`) with the block would pass, and its foreign axes are already baked into `blockAxes` via `getAvailableWithLinkersAxes` — so a chained linker bridging a disconnected universe could still slip in. The PR acknowledges this as a known deviation; this comment surfaces it for the follow-up ticket, since the discover-engine path in `findColumns({ mode: "enrichment" })` would reject such a linker outright.

### Issue 3 of 3
sdk/model/src/pframe_utils/columns.ts:113-132
**`integrableIds` uses outputs-priority IDs; `candidates` uses result-pool IDs**

`collectCtxColumnSnapshotProviders` deduplicates by `NativePObjectId` and keeps the first-source (`outputs`) `PObjectId` when the same column appears in both `ctx.outputs` and `ctx.resultPool` (the `ctx_column_sources.ts` comment calls this out explicitly). `candidates`, however, comes from `columns.getColumns(...)` which is sourced exclusively from `ctx.resultPool` — and therefore carries the result-pool `PObjectId`. For any non-root column whose outputs ID differs from its result-pool ID, `integrableIds.has(c.id)` will be `false`, and the column is silently excluded even though the discover engine accepted it. Root columns are safe because of the `rootIds.has(c.id)` guard, but result-pool columns also exposed via the block's prerun output could be affected. Sourcing `candidates` from the same provider set as `collectCtxColumnSnapshotProviders`, or cross-mapping by spec's native ID, would close this gap.

Reviews (1): Last reviewed commit: "MILAB-6463: select graph PFrame columns ..." | Re-trigger Greptile

Greptile also left 3 inline comments on this PR.

@notion-workspace

Copy link
Copy Markdown

@changeset-bot

changeset-bot Bot commented Jun 18, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 22830cc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 15 packages
Name Type
@platforma-sdk/model Patch
@milaboratories/pl-middle-layer Patch
@milaboratories/uikit Patch
@platforma-sdk/test Patch
@platforma-sdk/ui-vue Patch
@milaboratories/milaboratories.monetization-test.model Patch
@milaboratories/milaboratories.monetization-test.ui Patch
@milaboratories/milaboratories.ui-examples.model Patch
@milaboratories/milaboratories.ui-examples.ui Patch
@milaboratories/milaboratories.ui-examples Patch
@milaboratories/milaboratories.pool-explorer.model Patch
@milaboratories/milaboratories.pool-explorer Patch
@platforma-sdk/pl-cli Patch
@milaboratories/milaboratories.pool-explorer.ui Patch
@milaboratories/milaboratories.monetization-test Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the compatible column discovery logic in getRelatedColumns to utilize the spec-driver discover engine (ColumnCollectionBuilder) instead of the previous heuristic-based approach. Feedback suggests explicitly adding rootColumns as a source to the ColumnCollectionBuilder to prevent potential errors and silent fallbacks to conservative mode when those columns are not yet registered in the context sources.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread sdk/model/src/pframe_utils/columns.ts Outdated
@vadimpiven vadimpiven force-pushed the MILAB-6463_graph-pframe-discover-integration branch from 7f9e9fe to 321a634 Compare June 18, 2026 14:38
Comment thread sdk/model/src/pframe_utils/columns.ts Outdated
Comment thread sdk/model/src/pframe_utils/columns.ts Outdated
Comment thread sdk/model/src/pframe_utils/columns.ts Outdated
@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.28%. Comparing base (06ffd7b) to head (22830cc).
⚠️ Report is 17 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
sdk/model/src/pframe_utils/columns.ts 0.00% 15 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1708   +/-   ##
=======================================
  Coverage   55.28%   55.28%           
=======================================
  Files         312      313    +1     
  Lines       17508    17575   +67     
  Branches     3819     3835   +16     
=======================================
+ Hits         9679     9717   +38     
- Misses       6595     6622   +27     
- Partials     1234     1236    +2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

getRelatedColumns gates its partial-axis-match candidates through the
spec-driver discover engine, dropping result-pool columns that share only a
low-cardinality context axis (e.g. pl7.app/sampleId) with the block but cannot
actually join it — the cause of "axes sets are disjoint" errors. Block columns
and linkers are always kept; an unavailable verdict keeps the column.
@vadimpiven vadimpiven force-pushed the MILAB-6463_graph-pframe-discover-integration branch from 321a634 to 22830cc Compare June 19, 2026 08:40
@vadimpiven vadimpiven added this pull request to the merge queue Jun 19, 2026
Merged via the queue into main with commit 765de79 Jun 19, 2026
14 checks passed
@vadimpiven vadimpiven deleted the MILAB-6463_graph-pframe-discover-integration branch June 19, 2026 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants