Hello! We noticed a significant increase in outbound bandwidth on Tuscolo, investigated a bit, and concluded that Certstream is requesting partial tiles (the ones at the end of the log) disproportionately. I'm wondering if this is something we could work together to improve.
This is a 7d graph of http_requests_total across our logs. The with-github bucket is basically all github.com/d-Rickyy-b/certstream-server-go, because the other heavy hitters all match with-email, with-url, or anonymous.
Filtering for with-github requests, and bucketing them by kind, we see they are almost all partial tiles.
The https://c2sp.org/static-ct-api spec says:
A client, such as a Monitor, that “tails” a log SHOULD, as an optimization, avoid fetching partial data tiles when possible. If applying this optimization, the client MUST fetch the corresponding partial “level 0” tile, and use that to verify the current checkpoint. When fetching a subsequent checkpoint, the client MUST verify its consistency with the current checkpoint. If a data tile remains partial for too long (as defined by client policy), the client MUST fetch it to prevent delaying entries from being processed indefinitely.
The filippo.io/sunlight.Client.Entries implementation does that by not fetching the partial at all, unless the log grows by less than a full tile in a cycle.
https://pkg.geomys.dev/filippo.io/torchwood@v0.9.0/tlogclient.go#L139-L148
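To make the idea concrete, here is a rough Go sketch of that decision. It is not the torchwood code, just an illustration under a few assumptions: `nextFetchEnd` is a hypothetical helper, `processed` is the number of entries the client has already consumed, `treeSize` comes from a checkpoint the client has already verified, and 256 is the static-ct-api tile width.

```go
package main

import "fmt"

// tileWidth is the number of entries in a full static-ct-api data tile.
const tileWidth = 256

// nextFetchEnd is a hypothetical helper (not part of sunlight or torchwood).
// Given how many entries were already processed and the size of the current
// verified checkpoint, it returns the entry index (exclusive) up to which the
// client should fetch in this poll cycle.
func nextFetchEnd(processed, treeSize int64) int64 {
	if treeSize <= processed {
		// Nothing new in the log.
		return processed
	}
	// End of the last full tile covered by this checkpoint.
	fullEnd := treeSize - treeSize%tileWidth
	if fullEnd > processed {
		// At least one tile filled up since the last cycle: read up to the
		// tile boundary and leave the trailing partial tile for later.
		return fullEnd
	}
	// The log grew by less than a full tile: fetch the partial tile so that
	// entries are not delayed indefinitely.
	return treeSize
}

func main() {
	// First cycle on a log of 1000 entries: stop at the last full tile.
	fmt.Println(nextFetchEnd(0, 1000))
	// Next cycle, log did not cross a tile boundary: fetch the partial.
	fmt.Println(nextFetchEnd(768, 1000))
}
```

With a busy log, the second branch is almost never hit, so partial data tiles are rarely requested at all. A real client would still need to fetch the partial level 0 tile to verify the checkpoint, as the spec requires.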
It looks like Certstream polls every second (which is a very aggressive default) and always fetches the partial.
Did anything change in the logic or defaults that would explain the qps jump?
Would you consider adopting the filippo.io/sunlight logic?
Thank you!