Skip to content

Conversation

@Totodore
Copy link

@Totodore Totodore commented Jan 11, 2026

Description

The Tantivy OwnedValue enum used in the TopHitsVecEntry was triggering a postcard WontImplement error.

Redefine OwnedValue with support for postcard de-serialization as well as TopHits result types.

Testing

Top hits scenario is ok with quickwit and elasticsearch

(rest-api-tests) ➜  rest-api-tests git:(fix-tophits-agg-deserialization) ✗ ./run_tests.py --test scenarii/es_compatibility/0032-top_hits.yaml 
Namespace(engine='quickwit', test=['scenarii/es_compatibility/0032-top_hits.yaml'], binary=None)
Filtering tests prefixes: ['scenarii/es_compatibility/0032-top_hits.yaml']
============
============
============
🟢 scenarii/es_compatibility/_setup.quickwit.yaml: 8 steps (0 skipped)
🟢 scenarii/es_compatibility/0032-top_hits.yaml: 1 steps (0 skipped)
🟢 scenarii/es_compatibility/_teardown.quickwit.yaml: 7 steps (0 skipped)
(rest-api-tests) ➜  rest-api-tests git:(fix-tophits-agg-deserialization) ✗ ./run_tests.py --engine quickwit --test scenarii/es_compatibility/0032-top_hits.yaml
Namespace(engine='quickwit', test=['scenarii/es_compatibility/0032-top_hits.yaml'], binary=None)
Filtering tests prefixes: ['scenarii/es_compatibility/0032-top_hits.yaml']
============
============
============
🟢 scenarii/es_compatibility/_setup.quickwit.yaml: 8 steps (0 skipped)
🟢 scenarii/es_compatibility/0032-top_hits.yaml: 1 steps (0 skipped)
🟢 scenarii/es_compatibility/_teardown.quickwit.yaml: 7 steps (0 skipped)
(rest-api-tests) ➜  rest-api-tests git:(fix-tophits-agg-deserialization)

@Totodore Totodore force-pushed the fix-tophits-agg-deserialization branch from 3a3412f to d82dfa0 Compare January 11, 2026 14:01
@rdettai-sk
Copy link
Collaborator

This looks like a reasonable fix. To make sure we won't see regressions and ensure compatibility with ES it would great add an integration test (./quickwit/rest-api-tests/scenarii/es_compatibility). You'll see in the readme there how to run those against QW and ES.

@Totodore
Copy link
Author

@rdettai-sk I found multiple bugs while testing but it seems to be more on the tantivy side, mostly coming from here:

https://github.com/quickwit-oss/tantivy/blob/12977bc7c4cfa6e97936e8e0ecd871183bced2a4/src/aggregation/metric/top_hits.rs#L221-L242

  • if one of the field in docvalue_fields doesn't exists, this function panics because the initial checks for non-globs types assumes that the field exists.
  • if one of the field in docvalue_fields exists but is a dynamic fast field, the function also panics because it doesn't find the appropriate column, it seems that their is a particular naming for dynamic fields in tantivy and that our field is not cast. Because it doesn't find the field it falls into the first point and panics.
[git/checkouts/tantivy-9cf9af06161d52ce/d904630/src/aggregation/metric/top_hits.rs:226:42] name.as_str() = "_dynamic\u{1}type"
[git/checkouts/tantivy-9cf9af06161d52ce/d904630/src/aggregation/metric/top_hits.rs:226:65] field = "type"

Can we still merge this and I'll try to propose fixes on the tantivy repo?

@Totodore
Copy link
Author

Because quickwit is prepending _dynamic. prefix to all dynamic values, we should add this prefix to dynamic docvalue_fields in the user request, so it matches the _dynamic.{field} tantivy schema. It might be the subject of another PR though.

@Totodore
Copy link
Author

Totodore commented Jan 14, 2026

The top hits result is still not ES compatible, got the test wrong at the first place. I still have issues to make a request that is fully compatible with the quickwit response.

@rdettai-sk
Copy link
Collaborator

rdettai-sk commented Jan 17, 2026

Yes, the currently returned format is not compatible with elastic, and I think that's not good.

Es returns

      "buckets": [
        {
          "key": "1maria",
          "doc_count": 1,
          "recent_events": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": null,
              "hits": [
                  ...hits objects...

But QW returns:

      "buckets": [
        {
          "key": "1maria",
          "doc_count": 1,
          "recent_events": {
            "hits": [
                  ...hits objects...
            ]

Or stated otherwise, hits in TopHitsMetricResult should be a HitsMetadata and not Vec<Hit>.

@Totodore
Copy link
Author

@rdettai-sk I'll ensure the format is ES compatible when refactoring the TopHits aggregation in my Tantivy's PR.

@rdettai-sk
Copy link
Collaborator

  • It is not necessarily Tantivy's concern to be compatible with ES. You could keep the current output in Tantivy and make it ES compatible in Quickwit's TopHitsMetricResult
  • I would rather not merge with the es_compat tests expecting something that is not compatible with ES

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants