feat(sampling): add VarOptItemsUnion, serde, and cross-language compatibility tests#130
feat(sampling): add VarOptItemsUnion, serde, and cross-language compatibility tests#130Fengzdadi wants to merge 12 commits intoapache:mainfrom
Conversation
| packBoolsInto(out[markOffset:markOffset+markBytes], s.marks[:s.h]) | ||
| } | ||
|
|
||
| copy(out[markOffset+markBytes:], itemsBytes) |
There was a problem hiding this comment.
can you check not empty before write "itemsBytes"? If an user has custom "ItemsSerDe", it is easy to make corrupted sketch
| totalWeightR := 0.0 | ||
| if r > 0 { | ||
| totalWeightR = math.Float64frombits(binary.LittleEndian.Uint64(data[24:])) | ||
| } |
There was a problem hiding this comment.
we need to invalidate that preLong is 4, but r == 0
There was a problem hiding this comment.
Also can you add check totalWeightR is NaN?
| } | ||
|
|
||
| if r == 0 { | ||
| out := &VarOptItemsSketch[T]{ |
There was a problem hiding this comment.
we need to set capacities of each slice. becasue we use cap to determine grow or not.
| } | ||
|
|
||
| // NewVarOptItemsSketchFromSlice deserializes a sketch from bytes. | ||
| func NewVarOptItemsSketchFromSlice[T any](data []byte, serde ItemsSerDe[T]) (*VarOptItemsSketch[T], error) { |
There was a problem hiding this comment.
We need more stricter validation on prelong, h, n, r, k
sampling/varopt_items_union.go
Outdated
| } | ||
|
|
||
| func NewVarOptItemsUnion[T any](maxK int) (*VarOptItemsUnion[T], error) { | ||
| if maxK < 1 || maxK > varOptMaxK { |
There was a problem hiding this comment.
This line is duplicated. VarOptItemsSketch constructor covers it in L49.
sampling/varopt_items_union.go
Outdated
| } | ||
|
|
||
| func (u *VarOptItemsUnion[T]) Reset() error { | ||
| gadget, err := newVarOptItemsSketchAsGadget[T](u.maxK) |
There was a problem hiding this comment.
Why don't you just use "Reset" method in Sketch?
sampling/varopt_items_union.go
Outdated
| }, nil | ||
| } | ||
|
|
||
| func (u *VarOptItemsUnion[T]) MaxK() int { |
There was a problem hiding this comment.
User knows what maxK is. What is purpose on this method?
| assert.True(t, result.IsEmpty()) | ||
| } | ||
|
|
||
| func TestVarOptItemsUnion_UpdateSketchExactMode(t *testing.T) { |
There was a problem hiding this comment.
We need more test cases for update method.
- union sampling sketches which one of them has extreme heavy item.
- union identical sampling sketches.
- union with diffent k and weighted items.
There's a lot of content, but so far all tests have passed.