-
Notifications
You must be signed in to change notification settings - Fork 13
feat: Implemented search annotations for runs #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: Implemented search annotations for runs #68
Conversation
| inject_session_dependency(list_pipeline_runs_func) | ||
| ) | ||
| router.post( | ||
| "/api/pipeline_runs/search/", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a piece of feedback on REST semantics mainly. Typically you wouldn't see a POST request for searching for resources, in other words getting resources. If it were a GraphQL API then things would be different.
Here's the suggestion:
Turn this into a GET endpoint:
GET /api/pipeline_runs
This is clear that you get retrieving a list of pipeline runs, which would be an unfiltered, paginated response by default. Then add the search capability after the foundation (unfiltered search) is established.
From loading the UI, I can see there is already a request being made:
curl 'http://localhost:8000/api/pipeline_runs?include_pipeline_names=true&include_execution_stats=true'
and upon asking cursor, I know that it has a pagination implementation already. We can add the filter options to this existing endpoint rather than creating a new endpoint. Which would set a good precedence for all future search capabilities on other resources.
Here is an example of how I've seen this implemented:
curl 'http://localhost:8000/api/pipeline_runs?filter=in(annotations.env:staging,prod)+eq(annotations.app:agenticsearch)
With other operators like gte (greater than or equal), lte, etc.
Accounting for complex AND / OR combinations is something that would require some thought but I wouldn't go there unless that's something we see people using. The above example is a basic OR and AND.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason for using POST is that GET requests do not support body. Sending complex query structures via query parameters can be problematic in certain cases. If we can overcome that, then we can use GET piepline_runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will dig up some documentation on how I've seen this achieved previously then we can review if it meets our needs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some helpful resources:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we solve this once, we can use the same solution for all future APIs and won't need an extra /search endpoint to maintain
| "key", | ||
| "value", | ||
| ), | ||
| # Index for searching pipeline runs by annotation value only (across all keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure such feature would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine removing this. Reason why I had this is explained in my other longer comment.
|
Thank you for implementing this feature. I think there might slight misunderstanding regarding which predicates are needed. Imagine,
|
|
Another question: Would we be able to reuse the same search classes and functions for component search? |
yuechao-qin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding #68 (comment)
Yes, I plan on reusing this PR's search classes and functions for component search.
Regarding #68 (comment)
Thanks for questioning this Alexey. I found a bug in my code from your concern, which is fix now.
I believe the current implementation does handle your predicates. Let me first address your predicate examples then explain my thought on why I chose this design.
- "key1" in annotations
-
{ "annotation_filters": { "filters": [ {"operator": "equals", "key": "key1"} ], "operator": "and" } }
-
- annotations["key1"] == "value1"
-
{ "annotation_filters": { "filters": [ {"operator": "equals", "key": "key1"}, {"operator": "equals", "value": "value1"} ], "operator": "and" } }
-
- annotations["key1"] in ("value1", "value2")
-
{ "annotation_filters": { "filters": [ {"operator": "equals", "key": "key1"}, {"operator": "in_set", "values": ["value1", "value2"]} ], "operator": "and" } }
-
- "substr1" in annotations["key1"]
-
{ "annotation_filters": { "filters": [ {"operator": "equals", "key": "key1"}, {"operator": "contains", "value": "substr1"} ], "operator": "and" } }
-
- negation of any of those predicates
-
{ "annotation_filters": { "filters": [ {"operator": "equals", "key": "key1", "negate": true}, {"operator": "equals", "value": "value1", "negate": true} ], "operator": "and" } }
-
- and (a root-level predicate with a list of sub-predicates)
-
{ "annotation_filters": { "filters": [ { "filters": [ {"operator": "equals", "key": "key1"}, {"operator": "equals", "value": "value1"} ], "operator": "and" }, { "filters": [ {"operator": "equals", "key": "key2"}, {"operator": "in_set", "values": ["value2", "value3"]} ], "operator": "and" } ], "operator": "and" } }
-
So my thought on why I chose this design is that:
- The backend is simpler because it can combine multiple predicates with different operators.
- Can search for runs with annotations keys and/or values.
- The JSON structure is similar for key and/or value filters, such that it requires:
- search type (key/value)
- operator (equals/contains/in_set)
- can be negated (true/false)
- all predicates can be in groups with logical operators (and/or).
- The JSON structure translates to SQL in a straightforward way. For example, your predicates above will result in the following SQL:
"key1" in annotationsis equivalent to SQLwhere pipeline_run_annotation."key" IS NOT NULLannotations["key1"] == "value1"is equivalent to SQLwhere pipeline_run_annotation."key" = 'key1' AND pipeline_run_annotation.value = 'value1'
- Key motivation from this design is to make sure the SQL is feasible to generate. In addition, to make it as flexible as possible for any combination of predicates.
- Given the current design, it allows quite complex SQL predicates to be generated (groups, keys/values, oeprators, negation).
- The example predicates above you gave are trying to be more pythonic. If we want that UX, it's still possible with this backend design. Since SQL doesn't have a direct transation of pythonic predicates, we need to translate them to SQL, which is something we can explore in the future.
If my explaination does not align with your expectations, happy to change the design. Feel free to suggest any changes you think are better.
| "key", | ||
| "value", | ||
| ), | ||
| # Index for searching pipeline runs by annotation value only (across all keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine removing this. Reason why I had this is explained in my other longer comment.
|
I'm curious about |
TODO
EXISTSneeded for searching for Keys? IsEQUALSsufficient?Description
Closes #45
Implemented a new API to search annotations for runs.
Background
Annotations are
key-valuepairs. The following examples of annotations (e.g.key = value):env=productionteam=backendThis PR allows searches for key/value strings.
Features
/api/pipeline_runs/search/) to search key/value in annotations.(S1 and S2) or (S3 or S4)Use Cases and Examples
1. Key equals a string
Find runs where annotation key equals "environment":
{ "annotation_filters": { "filters": [ {"operator": "equals", "key": "environment"} ] } }2. Key contains substring AND value in set
Find runs where key contains "env" AND value is "prod" or "staging":
{ "annotation_filters": { "filters": [ {"operator": "contains", "key": "env"}, {"operator": "in_set", "values": ["prod", "staging"]} ], "operator": "and" } }3. Complex: (key contains OR value contains) AND key NOT contains
Find runs where (key contains "env" OR any value contains "prod") AND key NOT contains "deprecated":
{ "annotation_filters": { "filters": [ { "filters": [ {"operator": "contains", "key": "env"}, {"operator": "contains", "value": "prod"} ], "operator": "or" }, {"operator": "contains", "value": "deprecated", "negate": true} ], "operator": "and" } }Test Plan
uv run pytest tests/test_pipeline_run_search.py -v/api/pipeline_runs/<ID>/annotations/<KEY>/)/api/pipeline_runs/<ID>/annotations/)/api/pipeline_runs/search/)Demo
part1.mov
part2.mov