
Wire VirtualPoolHandle stats from Rust via single FFM call #5

Open
Bukhtawar wants to merge 3 commits into rayshrey:writer-memory-pool from Bukhtawar:virtual-pool-stats

Conversation

@Bukhtawar

Summary

  • Adds a parquet_get_pool_stats FFM function that returns write and merge pool stats (used, peak, limit) in a single native call instead of six separate FFM invocations
  • Registers write/merge virtual pools on ArrowNativeAllocator during plugin initialization, with a stats refresher callback that populates them on demand when _nodes/stats is called
  • Makes VirtualPoolHandle public and adds setVirtualPoolStatsRefresher so the parquet plugin can wire the Rust→Java stats bridge

Test plan

  • Unit tests for virtual pool registration, stats update, refresher invocation, and unsupported-child enforcement (ArrowNativeAllocatorTests)
  • Java compilation verified (arrow-base + parquet-data-format)
  • Spotless formatting passes
  • Rust cargo check (requires native toolchain)
  • Integration test with _nodes/stats endpoint showing write/merge pool stats

rayshrey and others added 3 commits May 25, 2026 09:58
Signed-off-by: rayshrey <rayshrey@amazon.com>
…eAllocator

Introduce VirtualPoolHandle for Rust-side memory pools that track
allocations locally but report stats to the Java ArrowNativeAllocator
for a unified view. The Rust write/merge MemoryPool reports its
used/peak via FFM, and ArrowNativeAllocator.stats() returns all pools
(Java Arrow + Rust virtual) in a single NativeAllocatorPoolStats.

- VirtualPoolHandle: volatile used/peak, updated by Rust via FFM
- registerVirtualPool(name, limit): creates a virtual pool entry
- stats() now iterates both real pools and virtual pools

Usage:
  VirtualPoolHandle writePool = allocator.registerVirtualPool("write", writeLimit);
  // Rust periodically calls: writePool.updateStats(used, peak)
  // stats() returns: {flight: ..., ingest: ..., query: ..., write: ...}

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
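
The virtual-pool idea in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class names `VirtualPoolSketch`, `Handle`, `Allocator`, and `PoolStats` are hypothetical stand-ins, and only `registerVirtualPool`, `updateStats`, and `stats` mirror names mentioned in the commit message. The sketch assumes the handle holds `used` and `peak` in volatile fields and that `stats()` simply snapshots every registered handle.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

final class VirtualPoolSketch {
    /** Immutable snapshot of one pool's counters (hypothetical type). */
    record PoolStats(long used, long peak, long limit) {}

    /** Hypothetical stand-in for VirtualPoolHandle: counters are written by a reporter thread. */
    static final class Handle {
        private final long limit;
        private volatile long used;
        private volatile long peak;

        Handle(long limit) {
            this.limit = limit;
        }

        /** Called by whatever bridges the Rust-side numbers into Java. */
        void updateStats(long used, long peak) {
            this.used = used;
            this.peak = peak;
        }

        PoolStats snapshot() {
            return new PoolStats(used, peak, limit);
        }
    }

    /** Hypothetical stand-in for the virtual-pool part of ArrowNativeAllocator. */
    static final class Allocator {
        private final Map<String, Handle> virtualPools = new ConcurrentHashMap<>();

        Handle registerVirtualPool(String name, long limit) {
            Handle handle = new Handle(limit);
            // Reject duplicate names so two reporters cannot share one entry by accident.
            if (virtualPools.putIfAbsent(name, handle) != null) {
                throw new IllegalArgumentException("pool already registered: " + name);
            }
            return handle;
        }

        /** Returns one snapshot per virtual pool, sorted by name for stable output. */
        Map<String, PoolStats> stats() {
            Map<String, PoolStats> out = new TreeMap<>();
            virtualPools.forEach((name, handle) -> out.put(name, handle.snapshot()));
            return out;
        }
    }
}
```

Because `used` and `peak` are two independent volatile writes in this sketch, a concurrent reader could observe a new `used` paired with an old `peak`. That is usually acceptable for monitoring output, but it is a trade-off to be aware of.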
- Add parquet_get_pool_stats FFM function that writes 6×i64 (write/merge
  used, peak, limit) in one native call
- Add RustBridge.getPoolStats() Java-side wrapper using confined Arena
- Register write/merge virtual pools in ParquetDataFormatPlugin and set
  a stats refresher callback that fetches Rust stats on demand
- Make VirtualPoolHandle public so parquet plugin can access updateStats
- Add setVirtualPoolStatsRefresher to ArrowNativeAllocator, called before
  stats() collects pool data
- Add unit tests for virtual pool registration, stats update, refresher
  invocation, and unsupported-child enforcement
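
The on-demand refresh described in the list above might look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: the native call is replaced by a plain `Supplier<long[]>` stub so the example runs without a native library, the value order (write used, peak, limit, then merge used, peak, limit) is assumed, and `RefresherSketch`, `Pool`, `register`, and `wire` are hypothetical names. Only `setVirtualPoolStatsRefresher` and `stats` echo names from the commit message.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

final class RefresherSketch {
    /** Hypothetical pool entry; fields are volatile so readers see the latest refresh. */
    static final class Pool {
        volatile long used;
        volatile long peak;
        volatile long limit;
    }

    /** Hypothetical allocator that refreshes virtual pools before reporting them. */
    static final class Allocator {
        private final Map<String, Pool> pools = new LinkedHashMap<>();
        private volatile Runnable refresher = () -> {};

        Pool register(String name) {
            Pool pool = new Pool();
            pools.put(name, pool);
            return pool;
        }

        void setVirtualPoolStatsRefresher(Runnable refresher) {
            this.refresher = refresher;
        }

        /** Runs the refresher once, then returns {used, peak, limit} per pool. */
        Map<String, long[]> stats() {
            refresher.run();
            Map<String, long[]> out = new LinkedHashMap<>();
            pools.forEach((name, p) -> out.put(name, new long[] {p.used, p.peak, p.limit}));
            return out;
        }
    }

    /** Wires both pools to a single stats source that yields six values per call. */
    static Allocator wire(Supplier<long[]> nativeCall) {
        Allocator allocator = new Allocator();
        Pool write = allocator.register("write");
        Pool merge = allocator.register("merge");
        allocator.setVirtualPoolStatsRefresher(() -> {
            long[] v = nativeCall.get(); // one call covers both pools
            if (v.length != 6) {
                throw new IllegalStateException("expected 6 values, got " + v.length);
            }
            // Assumed layout: [write used, peak, limit, merge used, peak, limit].
            write.used = v[0];
            write.peak = v[1];
            write.limit = v[2];
            merge.used = v[3];
            merge.peak = v[4];
            merge.limit = v[5];
        });
        return allocator;
    }
}
```

In this shape the stats source is consulted only when `stats()` is requested, so no background polling thread is needed, and each request costs one call rather than six.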
@rayshrey force-pushed the writer-memory-pool branch from b4117dd to 20c7987 May 26, 2026 11:22


2 participants