use iteration in set ops, wrap compressed sketch and unpack in iterator#644
use iteration in set ops, wrap compressed sketch and unpack in iterator#644AlexanderSaydakov merged 4 commits intomainfrom
Conversation
|
This should speed up union of wrapped compressed Theta sketches |
There was a problem hiding this comment.
I did not quite get the logic in DirectCompactCompressedSketch.getRetainedEntries() (I'm not saying it's wrong, just that I haven't yet stared at it enough to realize what exactly it's doing) but there were a few other things that seemed worth noting.
| if (hash < unionThetaLong_ && hash < gadget_.getThetaLong()) { | ||
| gadget_.hashUpdate(hash); // backdoor update, hash function is bypassed | ||
| } else { | ||
| if (sketchIn.isOrdered()) { break; } |
There was a problem hiding this comment.
The comment on intersection appeared higher in my review. So the longer version of the relevant comment is that one.
| matchSet[matchSetCount++] = hashIn; | ||
| } | ||
| } else { | ||
| if (sketchIn.isOrdered()) { break; } // early stop |
There was a problem hiding this comment.
This won't change over the course of the intersection. Should it be a flag pre-computed in advance to avoid a function call every time? Not sure if the JVM can easily determine there's no chance of a side-effect from other code.
There was a problem hiding this comment.
Generally I prefer not to introduce aliases, but if you think this can help, I can do that. Perhaps we could try measuring the effect one day.
| class DirectCompactCompressedSketch extends DirectCompactSketch { | ||
| /** | ||
| * Construct this sketch with the given memory. | ||
| * @param mem Read-only Memory object with the order bit properly set. |
There was a problem hiding this comment.
Rather than "properly set" maybe just say it must be ordered and with the bit set to true?
There was a problem hiding this comment.
I think this refers to memory byte order. Perhaps can be rephrased for clarity.
|
|
||
| /** | ||
| * Wraps the given Memory, which must be a SerVer 4 compressed CompactSketch image. | ||
| * Must check the validity of the Memory before calling. The order bit must be set properly. |
|
Ok, for some reason I was thinking it was going over the entries themselves, not the value representing the number of entries. Got it now and it's fine. |
|
If it was a bit difficult and confusing, perhaps some comments are needed. Let me see what I can do. |
No description provided.