MILAB-6463: select graph PFrame columns via the discover engine#1708
Conversation
🦋 Changeset detectedLatest commit: 22830cc The changes in this PR will be included in the next version bump. This PR includes changesets to release 15 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
Code Review
This pull request refactors the compatible column discovery logic in getRelatedColumns to utilize the spec-driver discover engine (ColumnCollectionBuilder) instead of the previous heuristic-based approach. Feedback suggests explicitly adding rootColumns as a source to the ColumnCollectionBuilder to prevent potential errors and silent fallbacks to conservative mode when those columns are not yet registered in the context sources.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
7f9e9fe to
321a634
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1708 +/- ##
=======================================
Coverage 55.28% 55.28%
=======================================
Files 312 313 +1
Lines 17508 17575 +67
Branches 3819 3835 +16
=======================================
+ Hits 9679 9717 +38
- Misses 6595 6622 +27
- Partials 1234 1236 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
getRelatedColumns gates its partial-axis-match candidates through the spec-driver discover engine, dropping result-pool columns that share only a low-cardinality context axis (e.g. pl7.app/sampleId) with the block but cannot actually join it — the cause of "axes sets are disjoint" errors. Block columns and linkers are always kept; an unavailable verdict keeps the column.
321a634 to
22830cc
Compare
Problem
getRelatedColumns(used bycreatePFrameForGraphs, which builds the PFrame behind every block's graphs) selected related result-pool columns with a partial-axis-match heuristic. A column that shared a single low-cardinality context axis with the block (e.g.pl7.app/sampleId) was admitted, and its other, foreign axis was then seeded into the candidate set — dragging in an entire disconnected column universe. The resulting PFrame fails on join withSpecError … some of axes sets are disjoint.Concretely, in
clonotype-spacean IG-anchored graph PFrame pulled in a redefined-clonotype universe (a sibling block's columns sharing onlysampleId), and the multi-sequence-alignment join crashed.Fix
Decide column membership with the spec-driver discover engine (
ColumnCollectionBuilder,enrichmentmode) instead of the JS heuristic:enrichmentmode rejects a hit that would introduce a floating/disconnected axis.Column assembly (
enrichCompatible) is unchanged; only the selection step changed.Behavior verified (clonotype-space, all input anchors)
SpecError).Known deviations from the previous heuristic (documented inline, for follow-up)
These are inherent to selecting via discover while still assembling with
enrichCompatible, and are noted as code comments ingetRelatedColumns:enrichCompatible'sblockAxesheuristic, so a hit needing a non-trivial qualification to integrate may not be qualified.allAxes"hanging" set) are not rediscovered — discovery anchors onrootColumnsonly.rootColumnspec; very large block column sets may stress the spec-driver (QuickJS) deadline.Scope / testing note
createPFrameForGraphsruns for every block's graphs. This was verified end-to-end onclonotype-spaceacross all three anchors; it should go through the monorepo test suite before merge given the broad reach.Greptile Summary
This PR fixes a disjoint-axis
SpecErrorincreatePFrameForGraphsby replacing a loose JS heuristic ingetRelatedColumnswith the spec-driver discover engine (enrichmentmode). The old heuristic admitted result-pool columns that shared any single axis with the block, which could drag in entire disconnected universes; the new path only accepts columns that the discover engine can integrate without leaving a floating axis.candidatesnow contains all predicate-matching result-pool columns, and membership is decided byAnchoredColumnCollection.findColumns({ mode: "enrichment" }); root columns (anchors) and linkers that share an axis with the block are unconditionally kept.integrableIdsis leftundefinedand only block columns + directly-connected linkers are returned — rather than re-admitting the full candidate set.Confidence Score: 3/5
The core fix is sound and verified end-to-end on clonotype-space, but the outer catch silently suppresses all discover-engine errors — including the QuickJS deadline explicitly called out as a risk for large column sets — which could produce unexpectedly empty graphs in production with no diagnostic trail.
The silent catch on a code path with a known stress trigger (large block column sets hitting QuickJS deadline) is the main concern: the graph silently loses enrichment columns and nothing surfaces to the operator. Combined with the potential ID-namespace mismatch and the still-heuristic linker path, the change needs the logging gap addressed and a broader test run before it can be considered fully safe.
sdk/model/src/pframe_utils/columns.ts — specifically the bare catch block around the discover-engine call and the linker-connectivity check.
Important Files Changed
getRelatedColumnswith the spec-driver discover engine (enrichment mode) for candidate filtering; linkers still use a JS heuristic; silent catch on discover failure has no logging; potential PObjectId namespace mismatch between integrableIds and candidates.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[getRelatedColumns called rootColumns + predicate] --> B[Build blockAxes from rootColumns axes] B --> C[Collect linkerColumns matching predicate] C --> D[getAvailableWithLinkersAxes expand blockAxes with connected linker axes] D --> E[Fetch candidates columns.getColumns predicate] E --> F{anchorSpecs non-empty?} F -- No --> G[integrableIds = undefined] F -- Yes --> H[ColumnCollectionBuilder .addSources ctx .build with anchors] H -- throws --> G H -- ok --> I[collection.findColumns mode: enrichment] I -- throws --> G I -- ok --> J[integrableIds = Set of discovered IDs collection.dispose] G --> K[Filter candidates] J --> K K --> L{rootIds.has c.id?} L -- Yes --> M[include] L -- No --> N{isLinkerColumn?} N -- Yes --> O{linkerConnectsToBlock? JS heuristic on blockAxes} O -- Yes --> M O -- No --> P[exclude] N -- No --> Q{integrableIds?.has c.id} Q -- true --> M Q -- false/undefined --> P M --> R[enrichCompatible assemble final PFrameDef]%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% flowchart TD A[getRelatedColumns called rootColumns + predicate] --> B[Build blockAxes from rootColumns axes] B --> C[Collect linkerColumns matching predicate] C --> D[getAvailableWithLinkersAxes expand blockAxes with connected linker axes] D --> E[Fetch candidates columns.getColumns predicate] E --> F{anchorSpecs non-empty?} F -- No --> G[integrableIds = undefined] F -- Yes --> H[ColumnCollectionBuilder .addSources ctx .build with anchors] H -- throws --> G H -- ok --> I[collection.findColumns mode: enrichment] I -- throws --> G I -- ok --> J[integrableIds = Set of discovered IDs collection.dispose] G --> K[Filter candidates] J --> K K --> L{rootIds.has c.id?} L -- Yes --> M[include] L -- No --> N{isLinkerColumn?} N -- Yes --> O{linkerConnectsToBlock? JS heuristic on blockAxes} O -- Yes --> M O -- No --> P[exclude] N -- No --> Q{integrableIds?.has c.id} Q -- true --> M Q -- false/undefined --> P M --> R[enrichCompatible assemble final PFrameDef]Prompt To Fix All With AI
Reviews (1): Last reviewed commit: "MILAB-6463: select graph PFrame columns ..." | Re-trigger Greptile