Skip to content

summary: better handle job cancellation [WIP]#155

Draft
glimchb wants to merge 1 commit intospdk:mainfrom
glimchb:cancellations
Draft

summary: better handle job cancellation [WIP]#155
glimchb wants to merge 1 commit intospdk:mainfrom
glimchb:cancellations

Conversation

@glimchb
Copy link
Copy Markdown
Contributor

@glimchb glimchb commented Feb 16, 2026

move logic about jobs failure or success
inside the summary job.

Sometimes if cacnellation happend too late,
the summary will still run even that we stated

if: ${{ !cancelled() }}

this patch handles those rare cases and votes=0

we could skip vote at all, but I like this for now so we can debug better for near future.

move logic about jobs failure or success
inside the summary job.

Sometimes if cacnellation happend too late,
the summary will still run even that we stated
```
if: ${{ !cancelled() }}
```

this patch handles those rare cases and votes=0

we could skip vote at all, but I like this for now
so we can debug better for near future.

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
@glimchb glimchb changed the title summary: better handle job cancellation summary: better handle job cancellation [WIP] Feb 16, 2026
Comment on lines 79 to +82
run: |
vote="-1"
message="Build failed. "

# Combine results from jobs and autorun_post.py
if [[ "${{ inputs.result }}" == "success" && \
"${{ needs.merge_outputs.outputs.result }}" == "success" ]]; then
set -x
if [[ "${{ inputs.result_common }}" == "cancelled" || "${{ inputs.result_hpe }}" == "cancelled" ]]; then
vote="0"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For cases where the common-tests were cancelled, leaving a message on the patch would be confusing for the submitter. Please remember that the actual result could be posted much later, leaving folks unsure how to proceed with such message.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to know that job was cancelled. we can post more actionable message here,
for example due to timeout or due to a push of new version of the patch for clarity.
at least for now posting always something could be useful for debugging and then we can re-consider to lower the verbosity once we confident with the long term of this or it ends up being too noisy...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already do not post anything for any of cancel conditions in https://github.com/spdk/spdk-ci/blob/main/.github/scripts/parse_gerrit_webhook.sh.

There is nothing actionable that a patch submitter can or should do. All GH cancellations are just result of patch state during handling of Gerrit event.

if [[ "${{ inputs.result }}" == "success" && \
"${{ needs.merge_outputs.outputs.result }}" == "success" ]]; then
set -x
if [[ "${{ inputs.result_common }}" == "cancelled" || "${{ inputs.result_hpe }}" == "cancelled" ]]; then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition will evaluate true when a job within either workflow times out. For such case we still need to post a failure.

I'm not sure how to differ the cancelled state due to concurrency and timeout. The autorun and pkgdep jobs would have to change to somehow track that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use REST API according to documentation to determine specifically timeouts

https://docs.github.com/en/rest/guides/using-the-rest-api-to-interact-with-checks?utm_source=chatgpt.com&apiVersion=2022-11-28

# Summary should be last job, please make sure to include yours in 'needs'.
summary:
name: Job summary
if: ${{ !cancelled() }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would adding check for needs.common.result != 'cancelled' work here ?
We know that !cancelled() works for patch_set_status which consistently cancels the gerrit-webhook-handler.yml workflow, skipping summary. I'm guessing that due to common and hpe are reusable workflows, they are not caught with the same condition.

[note the timeout concern below]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants