INTEROP-8976, INTEROP-8979: Token expiry alerts and per-rule Slack notifications #274
Draft
amp-rh wants to merge 5 commits into RedHatQE:main
Jira Cloud cannot programmatically create or rotate API tokens, so fully automated rotation is not possible with basic auth. Instead, add tooling for scheduled expiry monitoring with manual rotation:
- check-token-expiry.py: standalone script that reads expires_at from Vault and posts Slack alerts at 30/14/7/3/1 days before expiry
- prow-periodic.yaml: reference Prow periodic job config (daily 08:00 UTC)
- docs/vault-schema.md: Vault KV secret schema with new expires_at field
- docs/rotation-runbook.md: step-by-step manual rotation procedure

Relates: https://issues.redhat.com/browse/INTEROP-8976
Made-with: Cursor
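The alert schedule in this commit can be sketched as a small pure function. This is only an illustration, not the code in check-token-expiry.py: the function name check_expiry is hypothetical, expires_at is assumed to be an ISO date (YYYY-MM-DD), the expiry day itself is assumed to count as expired, and alerts are assumed to fire only when the remaining days exactly match a threshold (which fits a job that runs once a day). Exit code 2 for an expired token is taken from the PR summary.

```python
from datetime import date
from typing import Optional, Tuple

# Alert thresholds (days before expiry) taken from the commit message.
ALERT_DAYS = (30, 14, 7, 3, 1)

EXIT_OK = 0
EXIT_EXPIRED = 2  # "Exit code 2 if expired", per the PR summary


def check_expiry(expires_at: str, today: date) -> Tuple[int, Optional[str]]:
    """Return (exit_code, alert_text); alert_text is None on quiet days.

    Assumes expires_at is an ISO date such as "2025-06-30".
    """
    remaining = (date.fromisoformat(expires_at) - today).days
    if remaining <= 0:
        # Assumption: the expiry date itself already counts as expired.
        return EXIT_EXPIRED, f"Jira API token expired on {expires_at}"
    if remaining in ALERT_DAYS:
        return EXIT_OK, f"Jira API token expires in {remaining} day(s), on {expires_at}"
    return EXIT_OK, None
```

With exact-day matching, a skipped daily run silently skips that threshold's alert; a real script might prefer to alert whenever the remaining days fall at or below the next threshold.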
Skipping CI for Draft Pull Request.
Add optional slack_channel field to firewatch config rules. When set, firewatch sends a Slack notification after creating a new Jira issue, updating a duplicate, or filing a success story. Absent or empty value skips notification. Supports !default to read from env var.

Changes:
- Rule: add _get_slack_channel with !default/$FIREWATCH_DEFAULT_SLACK_CHANNEL
- SlackClient: add post_webhook static method for webhook-based posting
- Configuration: accept slack_bot_token and slack_webhook_url params
- Report: send Slack after issue creation, duplicate comment, success
- CLI: add --slack-bot-token and --slack-webhook-url to report command

Relates: https://issues.redhat.com/browse/INTEROP-8979
Made-with: Cursor
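The slack_channel lookup described above might look roughly like the following. This is a sketch under assumptions, not the actual Rule._get_slack_channel: the standalone function resolve_slack_channel and its env parameter are hypothetical, and returning None when !default is used but the environment variable is unset is a guess (the real code may raise instead).

```python
import os
from typing import Mapping, Optional

DEFAULT_MARKER = "!default"
DEFAULT_ENV_VAR = "FIREWATCH_DEFAULT_SLACK_CHANNEL"


def resolve_slack_channel(rule: dict, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the channel to notify, or None when notification is skipped."""
    raw = rule.get("slack_channel")
    if not isinstance(raw, str) or not raw.strip():
        # Absent or empty value: skip the notification.
        return None
    value = raw.strip()
    if value == DEFAULT_MARKER:
        # "!default" defers to the environment variable.
        # Assumption: an unset variable also means "skip".
        value = env.get(DEFAULT_ENV_VAR, "").strip()
    return value or None
```

Passing the environment as a parameter keeps the lookup easy to exercise in tests without touching the real process environment.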
check-token-expiry.py now uses SlackClient.post_webhook() instead of a local slack_post() function. post_webhook() propagates errors so callers can decide how to handle failures: the token-expiry script surfaces them as exit codes while Report._notify_slack() logs and continues.

Made-with: Cursor
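The division of responsibility described here (the poster raises, each caller decides) can be illustrated as follows. This is a sketch, not the PR's SlackClient code: it uses the standard library's urllib to stay dependency-free, whereas the project may well use requests; notify_and_continue, alert_exit_code, and the send parameter are hypothetical names, and the exit code 1 for a failed post is an assumed value.

```python
import json
import logging
import urllib.request
from typing import Callable

logger = logging.getLogger(__name__)


def post_webhook(webhook_url: str, text: str, timeout: float = 10.0) -> None:
    """POST a message to a Slack incoming webhook; let any error propagate."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises HTTPError/URLError (both OSError subclasses) on HTTP
    # error statuses and connection failures; nothing is caught here.
    with urllib.request.urlopen(request, timeout=timeout):
        pass


Sender = Callable[[str, str], None]


def notify_and_continue(webhook_url: str, text: str, send: Sender = post_webhook) -> bool:
    """Report-style caller: log the failure and carry on."""
    try:
        send(webhook_url, text)
    except OSError as exc:
        logger.warning("Slack notification failed: %s", exc)
        return False
    return True


def alert_exit_code(webhook_url: str, text: str, send: Sender = post_webhook) -> int:
    """Script-style caller: turn a failed post into a non-zero exit code."""
    try:
        send(webhook_url, text)
    except OSError:
        return 1  # assumed value; the PR does not state the exact code
    return 0
```

Keeping the error handling out of the shared method means a Jira report is never blocked by a Slack outage, while the periodic job still fails visibly when its only output, the alert, cannot be delivered.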
- Add types-requests to mypy pre-commit additional_dependencies
- Fix bare URLs and emphasis-as-heading in rotation-runbook.md and vault-schema.md
- Accept ruff-format changes and add mypy type: ignore on dict lookups

Made-with: Cursor
Findings from staging walkthrough:
- vault kv put destroys 7 undocumented fields (secretsync config, account credentials, staging token); switch to vault kv patch
- Document all 10 actual Vault secret fields, not just 3
- Add staging verification step (stage-redhat.atlassian.net)
- Note that API tokens are account-scoped, not instance-scoped
- Specify vault login -method=oidc for authentication
- Include access_token_msi in rotation patch command

Made-with: Cursor
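The first finding comes down to replace-versus-merge semantics on a KV v2 secret. The toy model below uses plain dictionaries rather than a real Vault client to show why a put loses fields that a patch keeps. The helper names kv_put and kv_patch are hypothetical, the values are placeholders, and other_field stands in for the fields that rotation must not touch.

```python
def kv_put(stored: dict, supplied: dict) -> dict:
    """Mimic `vault kv put`: the new version holds only the supplied keys."""
    return dict(supplied)


def kv_patch(stored: dict, supplied: dict) -> dict:
    """Mimic `vault kv patch`: supplied keys are merged into the stored ones."""
    merged = dict(stored)
    merged.update(supplied)
    return merged


# The token and expiry field names come from the PR; "other_field" is a
# hypothetical stand-in for the fields rotation must leave alone.
secret = {
    "access_token": "old",
    "access_token_msi": "old",
    "expires_at": "2025-01-01",
    "other_field": "keep-me",
}
rotation = {
    "access_token": "new",
    "access_token_msi": "new",
    "expires_at": "2026-01-01",
}

after_put = kv_put(secret, rotation)
after_patch = kv_patch(secret, rotation)
print("other_field" in after_put)    # False: the untouched field is gone
print("other_field" in after_patch)  # True: the untouched field survives
```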
/test all
Summary
Two related features unified under a shared Slack notification infrastructure.
Token expiry alerts (INTEROP-8976)
Jira Cloud does not support programmatic API token rotation. This PR adds tooling for scheduled expiry monitoring with manual rotation:
- check-token-expiry.py: Reads expires_at from the Vault KV secret and posts Slack alerts at 30/14/7/3/1 days before expiry. Exit code 2 if expired. Uses SlackClient.post_webhook() from the shared Slack infrastructure.
- prow-periodic.yaml: Reference Prow periodic job config (daily at 08:00 UTC). To be merged into openshift/release once approved.
- docs/vault-schema.md: Full Vault KV secret schema documenting all 10 fields (5 managed by rotation, 5 that must not be modified). Warns against vault kv put in favor of vault kv patch.
- docs/rotation-runbook.md: Step-by-step manual rotation procedure with staging verification, rollback, and troubleshooting. Tested against stage-redhat.atlassian.net.
Per-rule Slack notifications (INTEROP-8979)
Adds optional slack_channel field to firewatch config rules. When set, firewatch sends a Slack notification after creating or updating a Jira issue. Empty or absent value skips notification. Supports !default to read from $FIREWATCH_DEFAULT_SLACK_CHANNEL.
Changed files:
- src/objects/rule.py: slack_channel field with !default / env var pattern
- src/objects/slack_base.py: post_webhook() static method for Slack incoming webhooks
- src/commands/report.py: --slack-bot-token and --slack-webhook-url CLI options
- src/objects/configuration.py: Stores Slack credentials
- src/report/report.py: _notify_slack(), _slack_new_issue(), _slack_duplicate(), _slack_success() hooks
Config example:
{
  "failure_rules": [
    {
      "step": "install",
      "failure_type": "pod_failure",
      "classification": "Infrastructure",
      "jira_project": "LPINTEROP",
      "slack_channel": "#ocp-ci-firewatch-tool"
    }
  ]
}
Shared infrastructure
Both features post to Slack via SlackClient.post_webhook(). The token-expiry script previously had its own slack_post() function; it now uses the shared method. Errors propagate to callers: the expiry script surfaces them as exit codes while Report notifications log and continue.
Vault schema change
The Vault secret at kv/selfservice/firewatch-tool/jira-credentials contains 10 fields. Five are managed during token rotation:
- email (firewatch@redhat.com)
- access_token
- access_token_msi (kept in sync with access_token)
- access_token_stage
- expires_at
The remaining 5 fields (account credentials, secretsync config) must not be modified during rotation. Always use vault kv patch, never vault kv put, to avoid destroying these fields.
Doc updates from staging walkthrough
The rotation runbook and vault schema were validated against the Jira staging environment (stage-redhat.atlassian.net). Key fixes applied:
- Switched vault kv put to vault kv patch (a put would destroy 7 undocumented fields)
- Specified vault login -method=oidc for Vault authentication
- Added access_token_msi to the rotation patch command
Test plan
- check-token-expiry.py for correctness (Vault read, date parsing, shared Slack webhook)
- prow-periodic.yaml is valid Prow config
- docs/vault-schema.md (all 10 fields documented, patch not put)
- docs/rotation-runbook.md (staging verification, OIDC auth, patch command)
- report.py
- slack_channel field parsing in rule.py
- slack_channel in team configs
Relates: https://issues.redhat.com/browse/INTEROP-8976
Relates: https://issues.redhat.com/browse/INTEROP-8979