fix: include per-column details in exportBatch row count mismatch error #4214

Draft
andygrove wants to merge 1 commit into apache:main from andygrove:improve-export-batch-error-message

@andygrove
Member

Summary

  • When NativeUtil.exportBatch detects columns with mismatched row counts, the error message now includes each column's index, row count, and vector class name
  • Previously the message only showed the distinct row counts (e.g., [ArrayBuffer(8192, 0)]), making it hard to identify which column was problematic
  • The new message format is: col[0]=8192 (CometDecodedVector), col[1]=8192 (CometDecodedVector), col[2]=0 (CometPlainVector)
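As a rough illustration of the format described above, the following self-contained Java sketch builds the per-column detail string. It does not reproduce the actual NativeUtil.exportBatch code: the `Column` interface, the `numValues()` accessor, the stand-in vector classes, the exception message prefix, and the helper names are all hypothetical and exist only to make the example runnable without Comet on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types are Comet's real classes.
class RowCountMismatchSketch {

    // Hypothetical minimal view of a column vector: just its row count.
    interface Column {
        int numValues();
    }

    // Stand-ins named after the vector classes mentioned in the PR text.
    static final class CometDecodedVector implements Column {
        private final int rows;
        CometDecodedVector(int rows) { this.rows = rows; }
        public int numValues() { return rows; }
    }

    static final class CometPlainVector implements Column {
        private final int rows;
        CometPlainVector(int rows) { this.rows = rows; }
        public int numValues() { return rows; }
    }

    // Builds "col[i]=<rows> (<SimpleClassName>)" for every column, comma-separated.
    static String describeColumns(List<Column> columns) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < columns.size(); i++) {
            Column c = columns.get(i);
            parts.add("col[" + i + "]=" + c.numValues()
                    + " (" + c.getClass().getSimpleName() + ")");
        }
        return String.join(", ", parts);
    }

    // Throws if the columns disagree on row count; the message wording is assumed.
    static void checkRowCounts(List<Column> columns) {
        long distinctCounts = columns.stream()
                .mapToInt(Column::numValues)
                .distinct()
                .count();
        if (distinctCounts > 1) {
            throw new IllegalStateException(
                    "Row count mismatch: " + describeColumns(columns));
        }
    }

    public static void main(String[] args) {
        List<Column> batch = List.of(
                new CometDecodedVector(8192),
                new CometDecodedVector(8192),
                new CometPlainVector(0));
        try {
            checkRowCounts(batch);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, the offending column (here index 2, with 0 rows) can be read directly from the message, whereas a list of distinct counts alone does not say which column produced which count.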

Context

This improves diagnostics for #4211 where native_iceberg_compat scans at scale hit this error. Knowing the column index and vector type will help identify the root cause.

Test plan

  • Existing tests pass (no behavioral change for valid batches)

🤖 Generated with Claude Code

When columns in a batch have mismatched row counts, the error message
now reports each column's index, row count, and vector class name
instead of just the distinct row counts. This helps diagnose the
root cause of issues like apache#4211.