Bug report
I have a 1 GiB file and I get very different I/O statistics when I read it with plain Python file I/O versus pyarrow: the bytes read reported for pyarrow are unrealistically small (3.63 MB versus 1.02 GB).
with open('train-00000-of-00007.parquet', 'rb') as gh:
    %iops data = gh.read()
del data
======================================================================
IOPS Profile Results (strace (per-process))
======================================================================
Execution Time: 18.2150 seconds
Read Operations: 2
Write Operations: 0
Total Operations: 2
Bytes Read: 1.02 GB (1,091,305,162 bytes)
Bytes Written: 0.00 B (0 bytes)
Total Bytes: 1.02 GB (1,091,305,162 bytes)
----------------------------------------------------------------------
IOPS: 0.11 operations/second
Throughput: 57.14 MB/second
======================================================================
import pyarrow.parquet as pq
%iops pq.read_table('train-00000-of-00007.parquet')
======================================================================
IOPS Profile Results (strace (per-process))
======================================================================
Execution Time: 19.7621 seconds
Read Operations: 3
Write Operations: 3
Total Operations: 6
Bytes Read: 3.63 MB (3,808,731 bytes)
Bytes Written: 13.05 KB (13,360 bytes)
Total Bytes: 3.65 MB (3,822,091 bytes)
----------------------------------------------------------------------
IOPS: 0.30 operations/second
Throughput: 188.87 KB/second
======================================================================
I tried dropping the page cache with sync; sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches', but it didn't help.
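One possible cross-check is to compare the strace-based numbers with the kernel's own per-process accounting in /proc/self/io, which is summed over all threads of the process. The sketch below is not part of iops_profiler; the helper names (parse_proc_io, read_io_counters, io_delta, measure) are hypothetical, and it assumes a Linux kernel with task I/O accounting enabled.

```python
from pathlib import Path


def parse_proc_io(text: str) -> dict[str, int]:
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_io_counters(pid: str = "self") -> dict[str, int]:
    """Read the kernel's I/O accounting for a process (Linux only)."""
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


def io_delta(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return per-counter differences between two snapshots."""
    return {key: after[key] - before[key] for key in after if key in before}


def measure(func):
    """Call func() and return (result, I/O counter deltas for this process)."""
    before = read_io_counters()
    result = func()
    after = read_io_counters()
    return result, io_delta(before, after)


# Hypothetical usage with the file from this report:
#   import pyarrow.parquet as pq
#   _, delta = measure(lambda: pq.read_table('train-00000-of-00007.parquet'))
#   print(delta['rchar'], delta['read_bytes'])
```

Here rchar counts bytes requested through read-style system calls (including page-cache hits), while read_bytes counts bytes actually fetched from storage. If these deltas come out close to the file size for pq.read_table while the strace profile still shows a few MB, that would suggest the profiler is missing some reads rather than pyarrow reading less data.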
Environment Information
Linux 6.8.0, x86_64, ext4, Python 3.13, pyarrow 23, iops_profiler 0.2.0, IPython 9.9.0