OTA-1927: Eval cluster update prompts #2908
@fao89: This pull request references OTA-1927 which is a valid jira issue.
[APPROVALNOTIFIER] This PR is NOT APPROVED
Estimated code review effort: 2 (Simple), ~10 minutes. Pre-merge checks: 15 passed.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@eval/README.md`:
- Around line 54-63: The cluster-updates example commands in the eval/README.md
reference system_cluster_updates.yaml which uses https://localhost:8080, but the
local setup starts OLS at http://localhost:8080, causing a TLS mismatch. Add a
clarifying note in the README near these example commands explaining that for
local runs, users need to either modify the api_base setting in
system_cluster_updates.yaml to use http instead of https, or provide
instructions pointing to a separate local cluster-updates configuration preset
that uses HTTP. This will prevent users from encountering immediate
connection/TLS failures when attempting to run these commands locally.
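One way to picture the first option in the finding above (switching `api_base` from https to http for local runs) is the small Python sketch below. It only rewrites a URL string; the layout of system_cluster_updates.yaml is not shown in this thread, so the helper name `to_local_http` and the set of hosts treated as local are assumptions made for illustration, not part of the PR.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts treated as "local" here are an assumption for this sketch.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def to_local_http(api_base: str) -> str:
    """Return api_base with https swapped for http when it targets a local host.

    Hypothetical helper: non-local or already-http URLs are returned unchanged.
    """
    parts = urlsplit(api_base)
    if parts.scheme == "https" and parts.hostname in LOCAL_HOSTS:
        return urlunsplit(parts._replace(scheme="http"))
    return api_base


print(to_local_http("https://localhost:8080"))  # http://localhost:8080
```

The value returned would then be written into the `api_base` setting by hand, or a separate local preset could carry it, as the finding suggests.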
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8d9c1f51-f25b-40f6-b7f8-eba854c9da4a
📒 Files selected for processing (3)
- eval/README.md
- eval/eval_data_cluster_updates.yaml
- eval/system_cluster_updates.yaml
Add comprehensive MCP test scenarios to evaluation dataset for validating OpenShift cluster update workflow AI responses. These scenarios establish quality benchmarks for LLM outputs across different update phases.

Test Scenarios Added (conv_798-802):
- Precheck: Pre-upgrade validation and readiness assessment
  Comprehensive analysis of cluster health, available updates, and upgrade blockers before initiating updates
- Precheck-Specific: Targeted upgrade path validation
  Validates specific version availability and upgrade feasibility for planned update targets
- No-Updates: Cluster health assessment at latest version
  Health monitoring and operational status when no updates are available in current channel
- Progress: Real-time upgrade progress monitoring
  Tracks upgrade progress with component status, timeline analysis, and ETA calculations during active updates
- Troubleshoot: Upgrade failure diagnosis and remediation
  Root cause analysis and conservative troubleshooting guidance for failed or stuck upgrade scenarios

Each scenario includes:
- Complete analysis prompts with constraints and requirements
- Full ClusterVersion YAML data as attachments
- Full ClusterOperator YAML data as attachments
- Expected responses with Summary and TL;DR sections
- Real cluster data from production-like scenarios

These scenarios mirror the CONSOLE-5118 OLS integration workflow phases and provide the evaluation baseline for cluster update AI assistance.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Fabricio Aguiar <fabricio.aguiar@gmail.com>
rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
@fao89: all tests passed!
/cc @sriroopar @rioloc
      - Clear recommendation should be provided
- conversation_group_id: conv_800
  tag: cluster-updates-scenarios
  turns:
Thank you very much for your PR, Fabricio :)
A major bug is that turn metrics are not set up for every turn, which means the metrics we may want to analyze will not be captured. The rest looks okay; I dropped a couple of minor mismatches in a comment.
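A quick way to spot turns without metrics before pushing is sketched below in Python. It works on already-parsed evaluation data (a list of conversation groups, each with a `turns` list, as in the snippet under review). The key name `turn_metrics`, the `query` field, and the `example_metric` value are assumptions for illustration; the real schema of eval_data_cluster_updates.yaml may differ.

```python
def turns_missing_metrics(groups, key="turn_metrics"):
    """Return (conversation_group_id, turn_index) pairs for turns lacking metrics.

    A turn counts as missing when the key is absent or its value is empty.
    The default key name is an assumption, not confirmed by the PR.
    """
    missing = []
    for group in groups:
        group_id = group.get("conversation_group_id", "<unknown>")
        for index, turn in enumerate(group.get("turns") or []):
            if not turn.get(key):
                missing.append((group_id, index))
    return missing


# Hypothetical sample: the second turn has no metrics configured.
sample = [
    {
        "conversation_group_id": "conv_800",
        "tag": "cluster-updates-scenarios",
        "turns": [
            {"query": "...", "turn_metrics": ["example_metric"]},
            {"query": "..."},
        ],
    },
]

print(turns_missing_metrics(sample))  # [('conv_800', 1)]
```

If the evaluation framework gives an empty list a special meaning, the emptiness check would need to be relaxed to test only for the key's presence.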
Add comprehensive MCP test scenarios to evaluation dataset for validating OpenShift cluster update workflow AI responses. These scenarios establish quality benchmarks for LLM outputs across different update phases.
Test Scenarios Added (conv_798-802):
- Precheck: Pre-upgrade validation and readiness assessment. Comprehensive analysis of cluster health, available updates, and upgrade blockers before initiating updates
- Precheck-Specific: Targeted upgrade path validation. Validates specific version availability and upgrade feasibility for planned update targets
- No-Updates: Cluster health assessment at latest version. Health monitoring and operational status when no updates are available in current channel
- Progress: Real-time upgrade progress monitoring. Tracks upgrade progress with component status, timeline analysis, and ETA calculations during active updates
- Troubleshoot: Upgrade failure diagnosis and remediation. Root cause analysis and conservative troubleshooting guidance for failed or stuck upgrade scenarios
Each scenario includes:
- Complete analysis prompts with constraints and requirements
- Full ClusterVersion YAML data as attachments
- Full ClusterOperator YAML data as attachments
- Expected responses with Summary and TL;DR sections
- Real cluster data from production-like scenarios
These scenarios mirror the CONSOLE-5118 OLS integration workflow phases and provide the evaluation baseline for cluster update AI assistance.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Ref: openshift/console#16131