Add full_table mode for single-relation Iceberg loading #253
Open
hbarthels wants to merge 2 commits into
Adds an alternative `(full_table :name [T1 T2 ...])` syntax to `iceberg_data`, enabling all columns to be loaded into a single relation keyed by the row ID (UInt128). Previously only the per-column `(columns (column ...) ...)` mode was supported.

Proto: adds `IcebergTarget` message and optional `target` field (field 7) to `IcebergData`.

Grammar: adds `full_table` nonterminal and relaxes `iceberg_data` to accept `gnf_columns? full_table?` (exactly one must be present at runtime).

All SDK parsers/printers regenerated; new round-trip test added.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
comnik
reviewed
Jun 2, 2026
(catalog_uri "https://catalog.example")
(properties (prop "type" "rest"))
(auth_properties))
(full_table :edges [UINT128 INT INT])
Collaborator
I'll read up on this tomorrow, but one immediate question is how we'll handle the incremental loading and delta outputs. I guess in that case we'd still get separate outputs for inserts and deletes?
Summary
- Adds `(full_table :name [T1 T2 ...])` syntax to `iceberg_data`, enabling all Iceberg columns to be loaded into a single relation keyed by the row ID (UInt128).
- Proto: adds `IcebergTarget` message and `optional IcebergTarget target = 7` field to `IcebergData`; existing `columns` field is unchanged.
- Grammar: adds `full_table` nonterminal and changes `iceberg_data` to use `gnf_columns? full_table?` (one must be present; the two are mutually exclusive by convention, the same pattern as `csv_locator_paths? csv_locator_inline_data?`).

New syntax
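For illustration, the form used in this PR's test fixture, `(full_table :edges [UINT128 INT INT])`, carries a relation name and an ordered list of column types. Below is a minimal Python sketch of reading and printing that single form. The `FullTable` record, the regex, and both function names are hypothetical and exist only for this example; the generated SDK parsers and printers are the source of truth and may differ.

```python
import re
from dataclasses import dataclass
from typing import Tuple


# Hypothetical stand-in for the parsed form; the real SDK types are
# generated from the proto and grammar and may look different.
@dataclass(frozen=True)
class FullTable:
    name: str
    types: Tuple[str, ...]


# Matches only the single form `(full_table :name [T1 T2 ...])`.
_FULL_TABLE = re.compile(r"^\(\s*full_table\s+:(\w+)\s+\[([^\]]*)\]\s*\)$")


def parse_full_table(text: str) -> FullTable:
    """Parse one full_table form into its name and type list."""
    m = _FULL_TABLE.match(text.strip())
    if m is None:
        raise ValueError(f"not a full_table form: {text!r}")
    return FullTable(name=m.group(1), types=tuple(m.group(2).split()))


def print_full_table(ft: FullTable) -> str:
    """Pretty-print a FullTable back to its textual form."""
    return f"(full_table :{ft.name} [{' '.join(ft.types)}])"
```

Parsing the fixture form and printing it again should yield the same text, which is the property the round-trip test in this PR checks against the real parser.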
Test plan
- `make test` passes (720+ Python, 32k Julia, Go tests; no regressions)
- `tests/lqp/iceberg_data_full_table.lqp` parses and pretty-prints back to the same form
- `tests/bin/iceberg_data_full_table.bin` generated and committed

Notes for reviewer
- The grammar uses `gnf_columns? full_table?` rather than a strict `(gnf_columns | full_table)` alternation. A proper XOR would require an intermediate nonterminal with a union type, which the grammar type system doesn't support natively. Semantic enforcement (exactly one present) is left to downstream validation.

🤖 Generated with Claude Code
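Since the grammar accepts both, either, or neither of `gnf_columns` and `full_table`, the "exactly one present" rule has to be checked after parsing. A minimal sketch of such a check in Python follows. The `IcebergData` dataclass here is a hypothetical two-field mirror of the proto message, not the generated class, and treating an empty `columns` list as "absent" is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IcebergData:
    # Hypothetical mirror of the proto message, reduced to the two
    # fields that matter for this check.
    columns: List[str] = field(default_factory=list)
    target: Optional[str] = None  # stands in for the optional IcebergTarget


def validate_iceberg_data(data: IcebergData) -> None:
    """Raise ValueError unless exactly one of columns/target is set."""
    has_columns = len(data.columns) > 0  # assumption: empty list means absent
    has_target = data.target is not None
    if has_columns == has_target:
        raise ValueError(
            "iceberg_data requires exactly one of `columns` or `full_table`"
        )
```

Comparing the two booleans for equality rejects both the "neither" and the "both" cases in a single branch.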