Discrepancy in Dictionary Value in streaming IPC message #65

@manojVivek


Hi, we are migrating an application from apache-arrow to Flechette and ran into a unit test failure that looks like a bug in Flechette's streaming IPC handling. Below is a minimal reproduction:

import { tableFromIPC } from '@uwdata/flechette';

// IPC stream chunks:
const chunksBase64 = [
  '3AAAABAAAAAAAAoADAAKAAkABAAKAAAAEAAAAAABBAAIAAgAAAAEAAgAAAAEAAAAAQAAABQAAAAQABQAEAAOAA8ABAAAAAgAEAAAABgAAAAMAAAAAAABDXAAAAABAAAAGAAAALD///8QABgAFAAOAA8ABAAQAAgAEAAAADwAAAAwAAAAAAABBRAAAAAwAAAACAAKAAAABAAIAAAADAAAAAAABgAIAAQABgAAACAAAAAAAAAABAAEAAQAAAAEAAAAbm9kZQAAAAATAAAAYXR0cmlidXRlc19yZXNvdXJjZQA=',
  'qAAAABAAAAAMABgAFgAVAAQACAAMAAAAHAAAAMAAAAAAAAAAAAAAAAACBAAIAAoAAAAEAAgAAAAQAAAAAAAKABgADAAIAAQACgAAACwAAAAQAAAAAQAAAAAAAAAAAAAAAQAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAgAAAAAAAAAgAAAAAAAAAAzAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAZ2tlLWV1cm9wZS13ZXN0My0wLXByZWVtcHRpYmxlLXQyZC1zdC1lYzI3ZDNkYi1wd3d6AAAAAAAAAAAAAAAAAA==',
  'qAAAABAAAAAMABoAGAAXAAQACAAMAAAAIAAAAMAAAAAAAAAAAAAAAAAAAAMEAAoAGAAMAAgABAAKAAAAPAAAABAAAAABAAAAAAAAAAAAAAACAAAAAQAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAEAAAAAAAAAgAAAAAAAAAAEAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==',
  'qAAAABAAAAAMABgAFgAVAAQACAAMAAAAHAAAAAABAAAAAAAAAAAAAAACBAAIAAoAAAAEAAgAAAAQAAAAAAAKABgADAAIAAQACgAAACwAAAAQAAAAAgAAAAAAAAAAAAAAAQAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAwAAAAAAAAAgAAAAAAAAABmAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMwAAAGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAZ2tlLWV1cm9wZS13ZXN0My0wLXByZWVtcHRpYmxlLXQyZC1zdC1hZGQxOTQzNS13NzR2Z2tlLWV1cm9wZS13ZXN0My0wLXByZWVtcHRpYmxlLXQyZC1zdC03MTc4ODhkYi1ucmZyAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=',
  'qAAAABAAAAAMABoAGAAXAAQACAAMAAAAIAAAAMAAAAAAAAAAAAAAAAAAAAMEAAoAGAAMAAgABAAKAAAAPAAAABAAAAACAAAAAAAAAAAAAAACAAAAAgAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAEAAAAAAAAAgAAAAAAAAAAIAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==',
];

const chunks = chunksBase64.map(b64 => {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
});

const table = tableFromIPC(chunks);
const col = table.getChildAt(0);

console.log('Row 0:', col.at(0).node);
// Expected: "...ec27d3db-pwwz" (original dictionary)
// Actual:   "...add19435-w74v"

console.log('Row 1:', col.at(1).node);  // Expected: "...add19435-w74v"
console.log('Row 2:', col.at(2).node);  // Expected: "...717888db-nrfr"

The value at row[0].node should be gke-europe-west3-0-preemptible-t2d-st-ec27d3db-pwwz, but Flechette returns gke-europe-west3-0-preemptible-t2d-st-add19435-w74v, which is the value expected for row 1.

I did some initial debugging with Claude, and it points to a bug in how dictionary replacement is handled.
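To make the expected behavior concrete, here is a minimal sketch of the semantics I would expect from a streaming reader, assuming the second dictionary batch in the reproduction is a replacement (isDelta = false) rather than a delta. The message shapes and the resolveStream helper are hypothetical and only for illustration; they are not Flechette's internal types or API. The point is that each record batch should resolve its indices against the dictionary that was current when that batch appeared in the stream.

```javascript
// Hypothetical, simplified model of an IPC stream: each message is either a
// dictionary batch or a record batch holding dictionary indices.
// This is NOT Flechette's implementation, just the expected semantics.
function resolveStream(messages) {
  const dictionaries = new Map(); // dictionary id -> current values
  const rows = [];
  for (const msg of messages) {
    if (msg.type === 'dictionary') {
      const prev = dictionaries.get(msg.id) ?? [];
      // Delta batches append to the existing dictionary; non-delta batches
      // replace it. A new array is built either way, so rows resolved
      // earlier are never affected by later dictionary batches.
      dictionaries.set(
        msg.id,
        msg.isDelta ? [...prev, ...msg.values] : [...msg.values]
      );
    } else if (msg.type === 'record') {
      // Resolve indices against the dictionary in effect at this point.
      const dict = dictionaries.get(msg.id);
      for (const i of msg.indices) rows.push(dict[i]);
    }
  }
  return rows;
}

// Mirrors the message order in the reproduction (values abbreviated).
const rows = resolveStream([
  { type: 'dictionary', id: 0, isDelta: false, values: ['...ec27d3db-pwwz'] },
  { type: 'record', id: 0, indices: [0] },
  { type: 'dictionary', id: 0, isDelta: false,
    values: ['...add19435-w74v', '...717888db-nrfr'] },
  { type: 'record', id: 0, indices: [0, 1] },
]);

console.log(rows);
// [ '...ec27d3db-pwwz', '...add19435-w74v', '...717888db-nrfr' ]
```

If a reader instead keeps a single dictionary per id and resolves all batches against the last one received, row 0 would come out as "...add19435-w74v", which matches what I am seeing.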

Let me know what you think. I can work on a fix if you can confirm this is a bug.
