Naive implementation of chunking reader #288
jpsamaroo wants to merge 5 commits into JuliaData:master from …
Conversation
Possible steps to solve this problem:
Per the hackathon discussion, step 1 is already handled; step 2 would probably be a good idea too. Steps 3 and 4 are to follow. Additionally, for step 4, we should probably provide a utility function (or just a slightly different kwarg to …).
Import BlockIO/ChunkIter from Dagger
Wire blocking into loadtable
I almost forgot: I still need to actually implement incremental saving of read blocks to the output file when one is specified; otherwise we'll still read the whole CSV's data into memory before serializing it back out.
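A minimal sketch of what that incremental saving could look like: each newline-aligned block is serialized to the output file as soon as it is read, so only one block is in memory at a time. The function name, the `Serialization`-based output format, and the per-block handling below are all illustrative stand-ins, not JuliaDB's actual API:

```julia
using Serialization

# Illustrative sketch, not JuliaDB's real saving code: stream `input` in
# blocks of roughly `blocksize` bytes, extend each block to the next '\n'
# so no CSV row is split, and serialize each block to `output` immediately.
function save_blocks_incrementally(input::AbstractString, output::AbstractString;
                                   blocksize::Integer = 64 * 1024)
    open(output, "w") do out
        open(input, "r") do io
            buf = IOBuffer()
            while !eof(io)
                # Read up to `blocksize` bytes of the input.
                write(buf, read(io, blocksize))
                # Extend to the next newline so rows stay intact.
                while !eof(io)
                    b = read(io, UInt8)
                    write(buf, b)
                    b == UInt8('\n') && break
                end
                chunk = String(take!(buf))
                # Stand-in for real per-block parsing and table output.
                serialize(out, chunk)
            end
        end
    end
end
```

Only the current block is ever held in `buf`, so peak memory stays proportional to `blocksize` rather than the input file size.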
Quick update for onlookers: the latest commit attempts to split individual files into blocks before calling …
Bump, anyone up for reviewing this?
```julia
# Break file into blocks of size `blocksize` or less
fsize = filesize(file)
nblocks = max(div(fsize, blocksize), 1)
bios = blocks(file, '\n', nblocks)
```
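As a rough illustration of what the `blocks(file, '\n', nblocks)` call is doing, here is a minimal, self-contained sketch of newline-aligned splitting. The real implementation is Dagger's BlockIO (imported in this PR); the helper below is hypothetical and only computes the byte ranges, rather than returning IO objects:

```julia
# Illustrative sketch of newline-aligned splitting (not Dagger's BlockIO):
# divide a file into up to `nblocks` byte ranges, advancing each tentative
# boundary to the next '\n' so no CSV row is split across blocks.
function newline_aligned_ranges(path::AbstractString, nblocks::Integer)
    fsize = filesize(path)
    ranges = UnitRange{Int}[]
    open(path, "r") do io
        start = 1
        for i in 1:nblocks
            # Tentative end of block `i`; the last block takes the remainder.
            stop = i == nblocks ? fsize : (fsize * i) ÷ nblocks
            if stop < fsize
                # Scan forward until the block ends on a '\n' (or EOF).
                seek(io, stop - 1)  # next read returns byte `stop` (1-based)
                while !eof(io) && read(io, UInt8) != UInt8('\n')
                    stop += 1
                end
            end
            stop > start - 1 && push!(ranges, start:min(stop, fsize))
            start = stop + 1
            start > fsize && break
        end
    end
    ranges
end
```

Because each boundary snaps to a newline, every range except possibly the last ends on a complete row, so each block can be handed to the CSV parser independently.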
nice, I love that '\n' feature.
Looks like some change in TextParse 1.0 is breaking the ability to pass … EDIT: …
Replaces #129

TODO:
- `loadtable` …