feat: Implemented search annotations for runs #68

yuechao-qin · 2026-01-13T17:53:20Z

TODO

Discuss whether indexing and migrations are needed for annotations (key, value).
How do I confirm that the unit test will be part of CI/CD?
Is EXISTS needed for searching for Keys? Is EQUALS sufficient?
Should we combine user search in the new search API too?
Should the Black formatter configuration align with Google's Python style guide here?

Description

Closes #45

Implemented a new API to search annotations for runs.

Background

Annotations are key-value pairs. The following are examples of annotations (in key = value form):

env = production
team = backend

This PR allows searches for key/value strings.

Features

Created a new API (POST /api/pipeline_runs/search/) to search key/value in annotations.
Keys searchable operations
- EXISTS: If any key exists regardless of key string
- CONTAINS: If key contains a substring
- IN_SET: If key string matches set of strings
- EQUALS: If key string equals string
Values searchable operations
- CONTAINS: If value contains a substring
- IN_SET: If value matches set of strings
- EQUALS: If value equals string
Key and value search operations can be negated (i.e. NOT).
Any number of searches can be grouped together (AND, OR), and groups can be nested recursively.
- Example: (S1 and S2) or (S3 or S4)
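
To make the grouping and negation rules above concrete, here is a minimal in-memory sketch in Python. It is not the PR's implementation. The function names are hypothetical, a missing group "operator" is assumed to mean "and", the leaf filters of one group are assumed to apply to the same annotation, and only equals, contains, and in_set on values are covered (EXISTS and in_set on keys are left out).

```python
from typing import Any, Dict


def leaf_matches(leaf: Dict[str, Any], key: str, value: str) -> bool:
    """Evaluate one leaf filter against a single annotation (key, value)."""
    op = leaf["operator"]
    if op == "in_set":
        result = value in leaf["values"]
    else:
        # A leaf carries either "key" or "value"; pick the matching side.
        target, term = (key, leaf["key"]) if "key" in leaf else (value, leaf["value"])
        if op == "equals":
            result = target == term
        elif op == "contains":
            result = term in target
        else:
            raise ValueError(f"unsupported operator: {op!r}")
    # "negate": true flips the outcome of the leaf.
    return result != leaf.get("negate", False)


def matches(group: Dict[str, Any], annotations: Dict[str, str]) -> bool:
    """Evaluate a (possibly nested) filter group against one run's annotations."""
    # Assumption: a group without "operator" behaves like "and".
    combine = all if group.get("operator", "and") == "and" else any
    leaves = [f for f in group["filters"] if "filters" not in f]
    subgroups = [f for f in group["filters"] if "filters" in f]
    results = [matches(g, annotations) for g in subgroups]
    if leaves:
        # Assumption: all leaves of a group are checked against the same
        # annotation, and the run matches if any annotation satisfies them.
        results.append(
            any(
                combine(leaf_matches(leaf, k, v) for leaf in leaves)
                for k, v in annotations.items()
            )
        )
    return combine(results)


key_and_value = {
    "filters": [
        {"operator": "contains", "key": "env"},
        {"operator": "in_set", "values": ["prod", "staging"]},
    ],
    "operator": "and",
}
assert matches(key_and_value, {"environment": "staging"})
assert not matches(key_and_value, {"env": "dev", "tier": "prod"})
```

The last assertion shows the effect of the same-annotation assumption: the key match and the value match come from different annotations, so the run is not returned.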

Use Cases and Examples

1. Key equals a string

Find runs where annotation key equals "environment":

{
  "annotation_filters": {
    "filters": [
      {"operator": "equals", "key": "environment"}
    ]
  }
}

2. Key contains substring AND value in set

Find runs where key contains "env" AND value is "prod" or "staging":

{
  "annotation_filters": {
    "filters": [
      {"operator": "contains", "key": "env"},
      {"operator": "in_set", "values": ["prod", "staging"]}
    ],
    "operator": "and"
  }
}

3. Complex: (key contains OR value contains) AND key NOT contains

Find runs where (key contains "env" OR any value contains "prod") AND key NOT contains "deprecated":

{
  "annotation_filters": {
    "filters": [
      {
        "filters": [
          {"operator": "contains", "key": "env"},
          {"operator": "contains", "value": "prod"}
        ],
        "operator": "or"
      },
      {"operator": "contains", "key": "deprecated", "negate": true}
    ],
    "operator": "and"
  }
}

Test Plan

Unit test
- uv run pytest tests/test_pipeline_run_search.py -v
Manual Testing
- Watch demo videos below
- Add annotations for testing (PUT /api/pipeline_runs/<ID>/annotations/<KEY>/)
- Query annotations for ID (GET /api/pipeline_runs/<ID>/annotations/)
- Test with new search (POST /api/pipeline_runs/search/)
Test on Staging (same procedure as Manual Testing above)
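
For the manual search step, a request can be prepared with the Python standard library. This is only a sketch: the helper name build_search_request is hypothetical, the base URL assumes a local backend on port 8000, and the body shape follows the examples in this description.

```python
import json
import urllib.request


def build_search_request(base_url: str, annotation_filters: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the search endpoint."""
    body = json.dumps({"annotation_filters": annotation_filters}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/pipeline_runs/search/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "http://localhost:8000",  # assumed local development address
    {"filters": [{"operator": "equals", "key": "environment"}]},
)
# To send it against a running backend:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```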

Demo

part1.mov

part2.mov

morgan-wowk · 2026-01-19T18:19:58Z

cloud_pipelines_backend/api_router.py

        inject_session_dependency(list_pipeline_runs_func)
    )
+    router.post(
+        "/api/pipeline_runs/search/",


This feedback is mainly about REST semantics. Typically you wouldn't use a POST request to search for resources, in other words to get resources. If this were a GraphQL API, things would be different.

Here's the suggestion:

Turn this into a GET endpoint:

GET /api/pipeline_runs

This makes it clear that you are retrieving a list of pipeline runs, which would be an unfiltered, paginated response by default. Then add the search capability after the foundation (unfiltered listing) is established.

From loading the UI, I can see there is already a request being made:

curl 'http://localhost:8000/api/pipeline_runs?include_pipeline_names=true&include_execution_stats=true'

After asking Cursor, I know that it already has a pagination implementation. We can add the filter options to this existing endpoint rather than creating a new one, which would set a good precedent for all future search capabilities on other resources.

Here is an example of how I've seen this implemented:

curl 'http://localhost:8000/api/pipeline_runs?filter=in(annotations.env:staging,prod)+eq(annotations.app:agenticsearch)'

With other operators like gte (greater than or equal), lte, etc.

Accounting for complex AND / OR combinations is something that would require some thought but I wouldn't go there unless that's something we see people using. The above example is a basic OR and AND.
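
As a rough illustration of how the filter parameter in the example above could be read on the server side, here is a sketch in Python. The grammar (op(field:arg1,arg2) clauses separated by +) is inferred only from that single example, and parse_filter is a hypothetical helper, not an existing function in the backend.

```python
import re
from typing import List, Tuple
from urllib.parse import parse_qs, urlsplit

# One clause looks like: op(field:arg1,arg2)
_CLAUSE = re.compile(r"^(?P<op>[a-z]+)\((?P<field>[\w.]+):(?P<args>[^()]*)\)$")


def parse_filter(url: str) -> List[Tuple[str, str, List[str]]]:
    """Split the 'filter' query parameter into (operator, field, arguments)."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs decodes '+' as a space, so clauses end up space-separated.
    raw_clauses = query.get("filter", [""])[0].split()
    clauses = []
    for raw in raw_clauses:
        match = _CLAUSE.match(raw)
        if match is None:
            raise ValueError(f"malformed filter clause: {raw!r}")
        clauses.append((match["op"], match["field"], match["args"].split(",")))
    return clauses


clauses = parse_filter(
    "http://localhost:8000/api/pipeline_runs"
    "?filter=in(annotations.env:staging,prod)+eq(annotations.app:agenticsearch)"
)
```

In this reading, the values inside one in(...) clause are alternatives (OR) and the clauses themselves are combined with AND.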

The main reason for using POST is that GET requests do not support a body. Sending complex query structures via query parameters can be problematic in certain cases. If we can overcome that, then we can use GET pipeline_runs.
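
One possible way around the missing body, shown here only as a sketch, is to serialize the nested filter as JSON into a single query parameter. The parameter name annotation_filters and both helper names are assumptions for illustration. The resulting URL grows quickly with nesting, which is one of the ways query parameters can become problematic.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def to_get_url(base_url: str, annotation_filters: dict) -> str:
    """Encode a nested filter as compact JSON in one query parameter."""
    compact = json.dumps(annotation_filters, separators=(",", ":"))
    return f"{base_url}?{urlencode({'annotation_filters': compact})}"


def from_get_url(url: str) -> dict:
    """Decode the filter back from the query string."""
    return json.loads(parse_qs(urlsplit(url).query)["annotation_filters"][0])


filters = {
    "filters": [
        {"operator": "contains", "key": "env"},
        {"operator": "in_set", "values": ["prod", "staging"]},
    ],
    "operator": "and",
}
url = to_get_url("http://localhost:8000/api/pipeline_runs", filters)
```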

I will dig up some documentation on how I've seen this achieved previously. Then we can review whether it meets our needs.

Here are some helpful resources:

Google AIPs

Authorized Buyers API

A rust implementation

Bugsnag / Smartbear

Bold Commerce

I think if we solve this once, we can use the same solution for all future APIs and won't need an extra /search endpoint to maintain.

Ark-kun · 2026-01-20T10:44:38Z

cloud_pipelines_backend/backend_types_sql.py

+            "key",
+            "value",
+        ),
+        # Index for searching pipeline runs by annotation value only (across all keys)


I'm not sure such a feature would be useful.

I'm fine removing this. Reason why I had this is explained in my other longer comment.

cloud_pipelines_backend/database_ops.py

Ark-kun · 2026-01-20T11:07:44Z

Thank you for implementing this feature.

I think there might be a slight misunderstanding regarding which predicates are needed.
If I understand correctly, the current implementation treats the annotation key and value separately. However, the original issue asks to filter "by values of the keys".

Imagine annotations is a dict. Then the predicates that we need are:

"key1" in annotations
annotations["key1"] == "value1"
annotations["key1"] in ("value1", "value2")
"substr1" in annotations["key1"]
negation of any of those predicates
and (a root-level predicate with a list of sub-predicates)
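
Read literally, the predicates listed above could be sketched in Python as small composable functions over a dict. All function names here are hypothetical and only illustrate the requested semantics. They say nothing about how the backend stores or queries annotations.

```python
from typing import Callable, Dict, Iterable

Annotations = Dict[str, str]
Predicate = Callable[[Annotations], bool]


def has_key(key: str) -> Predicate:
    """"key1" in annotations"""
    return lambda a: key in a


def value_equals(key: str, value: str) -> Predicate:
    """annotations["key1"] == "value1" """
    return lambda a: key in a and a[key] == value


def value_in(key: str, values: Iterable[str]) -> Predicate:
    """annotations["key1"] in ("value1", "value2")"""
    allowed = frozenset(values)
    return lambda a: key in a and a[key] in allowed


def value_contains(key: str, substr: str) -> Predicate:
    """"substr1" in annotations["key1"]"""
    return lambda a: key in a and substr in a[key]


def negate(pred: Predicate) -> Predicate:
    """Negation of any predicate."""
    return lambda a: not pred(a)


def all_of(preds: Iterable[Predicate]) -> Predicate:
    """Root-level "and" over a list of sub-predicates."""
    preds = list(preds)
    return lambda a: all(p(a) for p in preds)


query = all_of([value_equals("env", "production"), negate(has_key("deprecated"))])
```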

Ark-kun · 2026-01-20T11:08:35Z

Another question: Would we be able to reuse the same search classes and functions for component search?

yuechao-qin

Regarding #68 (comment)

Yes, I plan on reusing this PR's search classes and functions for component search.

Regarding #68 (comment)

Thanks for questioning this, Alexey. Your concern led me to a bug in my code, which is fixed now.

I believe the current implementation does handle your predicates. Let me first address your predicate examples, then explain why I chose this design.

"key1" in annotations

{
  "annotation_filters": {
    "filters": [
      {"operator": "equals", "key": "key1"}
    ],
    "operator": "and"
  }
}

annotations["key1"] == "value1"

{
  "annotation_filters": {
    "filters": [
      {"operator": "equals", "key": "key1"},
      {"operator": "equals", "value": "value1"}
    ],
    "operator": "and"
  }
}

annotations["key1"] in ("value1", "value2")

{
  "annotation_filters": {
    "filters": [
      {"operator": "equals", "key": "key1"},
      {"operator": "in_set", "values": ["value1", "value2"]}
    ],
    "operator": "and"
  }
}

"substr1" in annotations["key1"]

{
  "annotation_filters": {
    "filters": [
      {"operator": "equals", "key": "key1"},
      {"operator": "contains", "value": "substr1"}
    ],
    "operator": "and"
  }
}

negation of any of those predicates

{
  "annotation_filters": {
    "filters": [
      {"operator": "equals", "key": "key1", "negate": true},
      {"operator": "equals", "value": "value1", "negate": true}
    ],
    "operator": "and"
  }
}

and (a root-level predicate with a list of sub-predicates)

{
  "annotation_filters": {
    "filters": [
      {
        "filters": [
          {"operator": "equals", "key": "key1"},
          {"operator": "equals", "value": "value1"}
        ],
        "operator": "and"
      },
      {
        "filters": [
          {"operator": "equals", "key": "key2"},
          {"operator": "in_set", "values": ["value2", "value3"]}
        ],
        "operator": "and"
      }
    ],
    "operator": "and"
  }
}

My reasoning for this design is as follows:

The backend is simpler because it can combine multiple predicates with different operators.
- Can search for runs by annotation keys and/or values.
- The JSON structure is similar for key and/or value filters, such that it requires:
  - search type (key/value)
  - operator (equals/contains/in_set)
  - can be negated (true/false)
  - all predicates can be in groups with logical operators (and/or).
- The JSON structure translates to SQL in a straightforward way. For example, your predicates above will result in the following SQL:
  - "key1" in annotations is equivalent to SQL where pipeline_run_annotation."key" = 'key1'
  - annotations["key1"] == "value1" is equivalent to SQL where pipeline_run_annotation."key" = 'key1' AND pipeline_run_annotation.value = 'value1'
The key motivation for this design is to make sure the SQL is feasible to generate and, in addition, to make it as flexible as possible for any combination of predicates.
- Given the current design, it allows quite complex SQL predicates to be generated (groups, keys/values, operators, negation).
The example predicates you gave above are more Pythonic. If we want that UX, it's still possible with this backend design. Since SQL has no direct translation of Pythonic predicates, we would need to translate them to SQL, which is something we can explore in the future.
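
To show the kind of SQL this maps to, here is a self-contained sketch using an in-memory SQLite database. The table pipeline_run_annotation and its "key" and value columns come from the SQL fragments above. The pipeline_run table, the pipeline_run_id column, and the function name are assumptions made for illustration, and the real schema and generated SQL may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pipeline_run (id TEXT PRIMARY KEY);
    CREATE TABLE pipeline_run_annotation (
        pipeline_run_id TEXT REFERENCES pipeline_run(id),  -- assumed column name
        "key" TEXT,
        value TEXT,
        PRIMARY KEY (pipeline_run_id, "key")
    );
    INSERT INTO pipeline_run VALUES ('run1'), ('run2'), ('run3');
    INSERT INTO pipeline_run_annotation VALUES
        ('run1', 'key1', 'value1'),
        ('run2', 'key1', 'other'),
        ('run2', 'key2', 'value1');
    """
)


def runs_where_key_has_value(connection: sqlite3.Connection, key: str, value: str) -> list:
    """annotations[key] == value, expressed as a single EXISTS subquery."""
    sql = """
        SELECT pr.id
        FROM pipeline_run AS pr
        WHERE EXISTS (
            SELECT 1
            FROM pipeline_run_annotation AS a
            WHERE a.pipeline_run_id = pr.id
              AND a."key" = ?
              AND a.value = ?
        )
        ORDER BY pr.id
    """
    return [row[0] for row in connection.execute(sql, (key, value))]


matching_runs = runs_where_key_has_value(conn, "key1", "value1")
```

Because both conditions sit inside one EXISTS, they must hold for the same annotation row. run2 has key1 and also a value1, but on different rows, so it is not returned.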

If my explanation does not align with your expectations, I'm happy to change the design. Feel free to suggest any changes you think are better.

yuechao-qin · 2026-01-20T18:00:18Z

cloud_pipelines_backend/backend_types_sql.py

+            "key",
+            "value",
+        ),
+        # Index for searching pipeline runs by annotation value only (across all keys)


I'm fine removing this. Reason why I had this is explained in my other longer comment.

Volv-G · 2026-01-21T00:51:25Z

I'm curious about contains and about whether we always expect annotation values to be strings. I think we might benefit from having arrays there, in which case contains should be checking whether the array contains the term from the query. Can this be supported?
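
A sketch of the semantics being asked about, assuming annotation values could be either a string or a list of strings (the PR itself treats values as strings, and the function name is hypothetical):

```python
from typing import List, Union


def contains(value: Union[str, List[str]], term: str) -> bool:
    """Substring match for strings, element match for arrays."""
    if isinstance(value, str):
        return term in value
    # For arrays, the term must equal one of the elements exactly.
    return any(item == term for item in value)
```

With this definition, "prod" is contained in the string "production" and in the array ["prod", "staging"], but not in the array ["production"].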

Implemented search annotations for runs

bef4c47

yuechao-qin marked this pull request as ready for review January 14, 2026 00:10

Remove key from ValueFilter

2e69b8f

morgan-wowk reviewed Jan 19, 2026

Ark-kun reviewed Jan 20, 2026

Ark-kun reviewed Jan 20, 2026

yuechao-qin added 2 commits January 20, 2026 09:48

Run black formatter and added configuration to the pyproject.toml

87e12b2

Fixed the group logic so that only 1 SQL EXISTS appear per group query

ac77dde

yuechao-qin closed this Jan 20, 2026

yuechao-qin reopened this Jan 20, 2026

yuechao-qin commented Jan 20, 2026

yuechao-qin requested a review from Ark-kun January 20, 2026 19:41
