OCPERT-347 Automate active release discovery for slash command and dashboard #979

LuboTerifaj wants to merge 9 commits into openshift:main from
Conversation
Enable auto-discovery for Release Progress Dashboard - Add discover_active_releases MCP tool - Add auto-load button to dashboard for discovering active releases - Update active-releases slash command to use discovery tool
- Move logic from MCP server to oar.core.release_discovery module - Use GitHub API with authentication for discovery and ConfigStore for date filtering - Add unit tests with mocked GitHub API and ConfigStore Used by MCP server to power Release Progress Dashboard and Claude Code slash command
> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. Use the following commands to manage reviews:
**Walkthrough**

Adds a GitHub-backed ReleaseDiscovery class and exception, a new MCP tool exposing active z-stream releases, client/data-collector wrappers, Streamlit UI integration to load discovered releases, unit tests for discovery logic, and a command-spec documentation file.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/commands/release/active-releases.md:
- Around line 235-240: Update the command notes to reflect the implemented flow:
change the date filtering note to say releases are kept through release_date + 1
day (not 2 days), remove or amend the claim that only oar_get_release_status()
is used per release and instead note that the status logic also calls
oar_is_release_shipped() to determine SHIPPED, and update the
MCP-calls/parallel-execution note to indicate that more than one MCP call may be
used per release (oar_get_release_status() plus oar_is_release_shipped()) while
still executing calls in parallel for performance.
In `@mcp_server/server.py`:
- Around line 1333-1339: The handler currently constructs
ReleaseDiscovery(configstore_factory=get_cached_configstore) and calls
discovery.get_active_releases() directly (blocking IO) on the FastMCP event
loop; extract that blocking work into a synchronous helper (e.g.,
_sync_get_active_releases or similar) that creates ReleaseDiscovery and returns
the active releases, then call it from the async handler via the thread pool
using asyncio.get_running_loop().run_in_executor(None,
_sync_get_active_releases) (mirroring
invoke_oar_command_async()/invoke_cli_command_async() patterns) and return
json.dumps({"releases": result}) from the awaiting coroutine.
In `@oar/core/release_discovery.py`:
- Around line 219-235: _is_release_active currently only catches
ConfigStoreException, so a malformed date from ConfigStore.get_release_date can
raise ValueError and abort discovery; update _is_release_active to additionally
catch ValueError (and optionally TypeError) around the datetime.strptime
parsing, log a warning including the release id and the offending
release_date_str, and return False so get_active_releases() skips that release
instead of failing the whole run.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8ca824bc-6379-4361-babe-9340c78d6bfd
📒 Files selected for processing (7)
- .claude/commands/release/active-releases.md
- mcp_server/server.py
- oar/core/exceptions.py
- oar/core/release_discovery.py
- tests/test_release_discovery.py
- tools/mcp_data_collector.py
- tools/release_progress_dashboard/release_progress_dashboard.py
```markdown
- **Auto-discovery**: Automatically finds releases from tracking files (no hardcoded lists)
- **Date filtering**: Excludes releases with release_date older than 2 days
- **StateBox required**: Only shows releases with StateBox (ready for QE work)
- **Timing**: Releases appear 1-2 days after cut-off when ART creates StateBox
- **Single MCP call per release**: Only `oar_get_release_status()` - contains all needed data
- **Parallel execution**: Call MCP tools in parallel for all releases for performance
```
Align the command notes with the implemented flow.
This footer says discovery excludes releases older than 2 days and that only oar_get_release_status() is needed per release. The new implementation keeps releases only through release_date + 1 day, and the status logic above still depends on oar_is_release_shipped() for SHIPPED.
Proposed fix:

```diff
-- **Date filtering**: Excludes releases with release_date older than 2 days
+- **Date filtering**: Excludes releases more than 1 day past `release_date`
 - **StateBox required**: Only shows releases with StateBox (ready for QE work)
 - **Timing**: Releases appear 1-2 days after cut-off when ART creates StateBox
-- **Single MCP call per release**: Only `oar_get_release_status()` - contains all needed data
+- **Per-release MCP calls**: Use `oar_get_release_status()` for task/issue data and `oar_is_release_shipped()` for live shipment state
 - **Parallel execution**: Call MCP tools in parallel for all releases for performance
```
**rioliu-rh** left a comment
Thanks for implementing the active release discovery feature! The overall approach is solid, but there are 3 critical performance and correctness issues that need to be addressed before merging.
Summary of Issues

1. 🔴 CRITICAL: Expensive ConfigStore Initialization (line 226)
   - Problem: Creates ConfigStore for each release (~1000ms each) just to read release_date
   - Impact: Adds 10+ seconds overhead for 10 y-streams
   - Fix: Query StateBox YAML directly (10x faster)
2. 🔴 CRITICAL: Incorrect Semver Parsing (line 154)
   - Problem: Uses naive `int()` conversion - breaks on pre-release tags (e.g., `4.20.0-rc1`)
   - Impact: Crashes on certain version formats, inconsistent with `detector.py` patterns
   - Fix: Use `sorted()` with `semver.VersionInfo.parse()` (matches detector.py:34-37 pattern)
3. 🟡 Performance: Sequential REST API Calls (line 195)
   - Problem: Makes 11+ sequential REST calls to discover y-streams
   - Impact: 3-8 seconds of network latency
   - Fix: Use GraphQL to batch-fetch directory structure in 1 call
4. 🟡 Future-proofing: Hardcoded OCP 4.x Regex (line 102)
   - Problem: Only supports the `^4\.\d{1,2}$` pattern
   - Impact: Breaks when OCP 5.x is released
   - Fix: Use `^\d+\.\d{1,2}$` to support any major version
Performance Impact
| Component | Before | After | Improvement |
|---|---|---|---|
| Y-stream discovery | 11 REST calls (3-8s) | 1 GraphQL call (0.5s) | ~6-16x faster |
| Release date lookup | 10 ConfigStore inits (10s) | 10 StateBox reads (1s) | 10x faster |
| Total Time | 13-18 seconds | 2-3 seconds | ~8x faster ✅ |
Action Items

Please address the inline comments with code examples. All fixes follow existing patterns in the codebase:

- Semver usage → see `oar/controller/detector.py:34-37` (sorted approach) and `detector.py:62-65` (VersionInfo)
- StateBox querying → see `oar/core/statebox.py`
- GraphQL queries → tested and verified to work with GitHub API

Let me know if you have any questions about the proposed solutions!
oar/core/release_discovery.py (Outdated)

```python
contents = self.repo.get_contents(self.RELEASES_PATH, ref=self.branch)

# Extract y-stream versions from directory names
y_streams = []
```
**Issue: Hardcoded regex only supports OCP 4.x versions**

Why this is a problem:

This regex pattern `^4\.\d{1,2}$` is hardcoded to match only versions starting with "4." (e.g., 4.12, 4.20). When OpenShift releases 5.x or 10.x versions in the future, this code will break and fail to discover those releases.

Solution:

Make the regex future-proof by supporting any major version:

```python
# Support any major version: 4.x, 5.x, 10.x, etc.
pattern = re.compile(r'^\d+\.\d{1,2}$')
```

This matches the same pattern used in oar/controller/detector.py:97 but is more future-proof.
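As a quick sanity check of the proposed pattern (a standalone sketch of its assumed behavior, easy to verify locally):

```python
import re

# Future-proof y-stream pattern from the suggestion above
pattern = re.compile(r'^\d+\.\d{1,2}$')

# Matches y-stream directory names for any major version...
for name in ["4.12", "4.20", "5.1", "10.12"]:
    assert pattern.match(name)

# ...but rejects full z-stream versions and junk
for name in ["4.20.17", "v4.20", "4.", "4.123"]:
    assert not pattern.match(name)
```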
Yeah, good catch! Thanks.
oar/core/release_discovery.py (Outdated)

```python
for y_stream in y_streams:
    latest_release = self.get_latest_release_for_ystream(y_stream)
    if not latest_release:
```
**Issue: Inefficient API usage - sequential REST calls and redundant logic**

Why this is a problem:

Current implementation has two inefficiencies:

1. Sequential REST calls: Makes 11+ sequential REST calls (1 for `_releases/` + 10 for each `4.X.z.yaml`)
2. Redundant StateBox listing: After getting latest release from tracking file, the current approach would list StateBox directory again to find the same release
Solution:

Use PyGithub's native GraphQL support to batch-fetch tracking files, then directly access StateBox:

```python
from semver import VersionInfo

def get_active_releases(self, keep_days_after_release: int = 1) -> List[str]:
    """Fast discovery using GraphQL to batch-fetch tracking files"""
    # Use PyGithub's native GraphQL support (no requests library needed)
    query = """
    query {
      repository(owner: "openshift", name: "release-tests") {
        object(expression: "z-stream:_releases") {
          ... on Tree {
            entries {
              name
              type
              object {
                ... on Tree {
                  entries {
                    name
                    type
                    object {
                      ... on Blob {
                        text
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    """
    # Use PyGithub's requestJsonAndCheck (native GraphQL support)
    headers, data = self._github._Github__requester.requestJsonAndCheck(
        "POST",
        "/graphql",
        input={"query": query}
    )

    active_releases = []
    pattern = re.compile(r'^\d+\.\d{1,2}$')  # Future-proof: 4.x, 5.x, 10.x

    # Process each y-stream directory
    for y_stream_dir in data['data']['repository']['object']['entries']:
        if y_stream_dir['type'] != 'tree':
            continue
        y_stream = y_stream_dir['name']
        if not pattern.match(y_stream):
            continue

        # Find the tracking file: {y-stream}.z.yaml
        tracking_file = f"{y_stream}.z.yaml"
        for file_entry in y_stream_dir['object']['entries']:
            if file_entry['name'] == tracking_file and file_entry['type'] == 'blob':
                try:
                    # Parse YAML content from GraphQL response
                    yaml_content = file_entry['object']['text']
                    tracking_data = yaml.safe_load(yaml_content)

                    # Get all releases and find latest using semver
                    releases = list(tracking_data.get("releases", {}).keys())
                    if not releases:
                        continue

                    # Sort using semver (like detector.py:34-37)
                    sorted_releases = sorted(releases, key=lambda v: VersionInfo.parse(v))
                    latest = sorted_releases[-1]

                    # Check if active (queries StateBox directly - see line 226 comment)
                    if self._is_release_active(latest, keep_days_after_release):
                        active_releases.append(latest)
                except Exception as e:
                    logger.debug(f"Error processing y-stream {y_stream}: {e}")
                break

    logger.info(f"Discovered {len(active_releases)} active releases: {active_releases}")
    return active_releases
```

Key improvements:

1. PyGithub native GraphQL: Uses `requestJsonAndCheck()` (no `requests` library import)
2. Single API call: Fetches all tracking files in one GraphQL query
3. No redundant StateBox listing: Directly fetches StateBox by filename (handled in `_is_release_active()`)
4. Performance: 11+ REST calls (~3-8s) → 1 GraphQL call (~0.5s) = 6-16x faster
Good point. I will update it to use GraphQL.
```python
if not releases:
    logger.debug(f"No releases found for y-stream {y_stream}")
    return None
```
**Issue: Incorrect semver parsing using naive int() conversion**

Why this is a problem:

This approach `tuple(map(int, v.split('.')))` is problematic:

1. Fails for pre-release tags: Crashes on versions like `4.20.0-rc1` (can't convert `'0-rc1'` to int)
2. Doesn't follow semver rules: Naive integer comparison doesn't handle semantic versioning correctly
3. Inconsistent with codebase: `oar/controller/detector.py:62-65` uses `VersionInfo.parse()` for version comparison
Solution:

Use `sorted()` with `semver.VersionInfo` (matches detector.py pattern):

```python
from semver import VersionInfo

def get_latest_release_for_ystream(self, y_stream: str) -> Optional[str]:
    # ... (fetch releases code) ...
    if not releases:
        logger.debug(f"No releases found for y-stream {y_stream}")
        return None

    # Sort using semver (similar to detector.py:34-37, but with proper semver)
    sorted_releases = sorted(releases, key=lambda v: VersionInfo.parse(v))
    latest = sorted_releases[-1]
    logger.debug(f"Latest release for {y_stream}: {latest}")
    return latest
```

Why `sorted()` instead of `max()`:

Following the pattern in detector.py:34-37 which uses `sorted()` and takes the last element. Both work correctly, but `sorted()` is more explicit and matches existing codebase patterns.
Agree on using semver, especially for possible future cases if we support versions beyond z-streams (e.g., pre-release tags like rc, ec).

However, I'm using `max()` instead of `sorted()` because:

1. More efficient: O(n) vs O(n log n)
2. Clearer intent: we only need the maximum, not a sorted list
3. More Pythonic: `max()` is the standard idiom for "find largest"
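For illustration, the `max()` approach discussed above can be sketched standalone; `version_key` below is a hypothetical, simplified stdlib stand-in for `semver.VersionInfo.parse` that handles only a dotted numeric core plus an optional pre-release suffix:

```python
def version_key(v: str):
    # Split "4.20.0-rc.1" into core "4.20.0" and pre-release "rc.1"
    core, _, pre = v.partition("-")
    nums = tuple(int(x) for x in core.split("."))
    # Per semver precedence, a final release (no pre-release tag)
    # sorts after any pre-release of the same core version
    return (nums, pre == "", pre)

releases = ["4.20.3", "4.20.17", "4.20.0-ec.1", "4.20.0-rc.1", "4.20.0"]

# Single O(n) pass, no sorted list materialized
latest = max(releases, key=version_key)
print(latest)  # 4.20.17
```

With a real semver library the `key` would simply be `VersionInfo.parse`; the trade-off between `max()` and `sorted()[-1]` is the same either way.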
```python
# Parse date in format: 2026-Mar-25
release_date = datetime.strptime(release_date_str, "%Y-%b-%d").date()
today = datetime.now().date()
```
**CRITICAL Issue: Expensive ConfigStore initialization**

Why this is a problem:

Creating a ConfigStore instance for each release is extremely expensive:

- Each ConfigStore init costs ~1000ms (JWE decryption + GitHub HTTP request + YAML parsing)
- For 10 y-streams, this adds 10+ seconds of overhead
- ConfigStore data is immutable, but we only need one field: `release_date`
- Even with MCP server caching, cache misses are very expensive
Solution:

Query release_date directly from StateBox YAML instead:

```python
def _is_release_active(self, release: str, keep_days: int) -> bool:
    """
    Check if release is within date window (release_date + keep_days).

    Queries StateBox YAML directly instead of creating an expensive
    ConfigStore instance.

    Args:
        release: Release version (e.g., "4.20.17")
        keep_days: Number of days after release_date to keep visible

    Returns:
        True if release is active, False otherwise
    """
    try:
        # Extract y-stream from release (e.g., "4.20.17" -> "4.20")
        y_stream = '.'.join(release.split('.')[:2])
        statebox_path = f"{self.RELEASES_PATH}/{y_stream}/statebox/{release}.yaml"

        # Fetch StateBox file
        file_content = self.repo.get_contents(statebox_path, ref=self.branch)
        statebox_data = yaml.safe_load(file_content.decoded_content.decode('utf-8'))

        # Extract release_date from metadata
        release_date_str = statebox_data['metadata']['release_date']

        # Parse date (format: "2026-Mar-25")
        release_date = datetime.strptime(release_date_str, "%Y-%b-%d").date()
        today = datetime.now().date()

        is_active = today <= release_date + timedelta(days=keep_days)
        logger.debug(
            f"Release {release} {'active' if is_active else 'past visibility window'} "
            f"(release_date: {release_date_str})"
        )
        return is_active
    except Exception as e:
        # StateBox not ready for this release (build data not available yet)
        logger.debug(f"StateBox not available for {release}: {e}")
        return False
```

Performance improvement: 10 ConfigStore inits (~10s) → 10 StateBox reads (~1s) = ~10x faster

Example StateBox structure:

```yaml
schema_version: '1.0'
release: 4.20.17
metadata:
  release_date: 2026-Mar-25  # <-- This is what we need
  jira_ticket: ART-14325
  advisory_ids:
    rpm: 157165
```
I considered several approaches initially - each has some issues:

- ConfigStore: New cache every 7 days per version - not ideal
- StateBox class: Calls ConfigStore in constructor, same problem
- Direct REST calls: N+1 query issue

Per your suggestion, I explored GraphQL, which gives us two options:

1. Two queries (tracking + specific StateBox files): ~186 KB, only what we need
2. One query (all StateBox files): Simpler, but ~1,192 KB (84% waste)

Going with two queries as it is more efficient (fetches only the 10 needed files) and scales better as the number of StateBox files grows.
```python
    st.sidebar.success(f"✅ Added {added_count} active release(s)")
    st.rerun()
else:
    st.sidebar.info("All active releases already tracked")
```
**Issue: Misleading message when no active releases found**

Why this is a problem:

When discovery returns an empty list (e.g., during holidays when no releases are active), the current message says:

"All active releases already tracked"

This is misleading because it implies active releases exist but are already tracked, when the reality is that no active releases were found.

Solution:

Distinguish between two scenarios:

```python
if added_count > 0:
    st.sidebar.success(f"✅ Added {added_count} active release(s)")
    st.rerun()
else:
    # Distinguish: no releases found vs all already tracked
    if len(active_releases) == 0:
        st.sidebar.info("ℹ️ No active releases found")
    else:
        st.sidebar.info(f"All {len(active_releases)} active release(s) already tracked")
```

Why this matters:

1. Clear communication: User understands discovery ran but found nothing
2. Avoids confusion: Doesn't suggest releases exist when they don't
3. Future-proof: Message not tied to hardcoded `keep_days_after_release=1` parameter
Yeah, including the number of releases is a good idea to avoid confusion.
- Remove ConfigStore dependency, fetch StateBox/tracking files directly via GraphQL - Replace REST API calls with GraphQL batch queries - Fix regex pattern for future-proof y-stream matching - Fix semver parsing for rc/ec version handling - Refine dashboard message
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
♻️ Duplicate comments (2)

**oar/core/release_discovery.py (1)**

445-461: ⚠️ Potential issue | 🟠 Major — Handle non-mapping StateBox payloads without aborting discovery.

This now skips bad date strings, but Line 447 still assumes both `release_data` and `metadata` are dicts. A malformed YAML like `metadata: null` will raise `AttributeError` here and fail the whole batch instead of just skipping that release.

Suggested hardening:

```diff
     try:
+        if not isinstance(release_data, dict):
+            logger.debug(f"Invalid StateBox data for {release}: {release_data!r}")
+            return False
+
         # Extract release_date from StateBox metadata
-        release_date_str = release_data.get("metadata", {}).get("release_date")
+        metadata = release_data.get("metadata")
+        if not isinstance(metadata, dict):
+            logger.debug(f"Invalid StateBox metadata for {release}: {metadata!r}")
+            return False
+
+        release_date_str = metadata.get("release_date")
         if not release_date_str:
             logger.debug(f"release_date not found in StateBox for {release}")
             return False
@@
-    except (ValueError, KeyError) as e:
+    except (AttributeError, TypeError, ValueError) as e:
         logger.debug(f"Failed to parse release_date for {release}: {e}")
         return False
```

**mcp_server/server.py (1)**
1333-1338: ⚠️ Potential issue | 🟠 Major — Move discovery work off the FastMCP event loop.

Lines 1335-1336 still run blocking GitHub I/O directly inside the async tool. Unlike `invoke_oar_command_async()` and `invoke_cli_command_async()`, one slow discovery call can stall unrelated MCP requests.

Suggested fix:

```diff
+def _discover_active_releases_sync() -> list[str]:
+    return ReleaseDiscovery().get_active_releases()
+
 @mcp.tool()
 async def discover_active_releases() -> str:
@@
     try:
-        # Use ReleaseDiscovery class with authenticated GitHub API
-        discovery = ReleaseDiscovery()
-        active_releases = discovery.get_active_releases()
+        loop = asyncio.get_running_loop()
+        active_releases = await loop.run_in_executor(
+            CLI_THREAD_POOL,
+            _discover_active_releases_sync,
+        )
         return json.dumps({"releases": active_releases})
```

Verify that this handler is still bypassing the thread-pool pattern used elsewhere. Expected result: the wrapper helpers show `run_in_executor(...)`, while `discover_active_releases()` shows a direct `ReleaseDiscovery().get_active_releases()` call.

```bash
#!/bin/bash
set -euo pipefail
printf '%s\n' '--- thread-pool patterns already in this file ---'
rg -n -C3 'run_in_executor|invoke_oar_command_async|invoke_cli_command_async' mcp_server/server.py
printf '\n%s\n' '--- discovery handler ---'
rg -n -C3 'async def discover_active_releases|ReleaseDiscovery\(|get_active_releases\(' mcp_server/server.py
```
🧹 Nitpick comments (1)

**tests/test_release_discovery.py (1)**

138-147: Add the rc/ec case that motivated the semver switch.

This test still passes with the old plain `4.x.y` ordering, so it doesn't lock in the behavior this PR changed. A small fixture with `4.20.0-ec.1`, `4.20.0-rc.1`, and `4.20.0` would cover the regression path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@mcp_server/server.py`:
- Around line 1333-1338: The handler is calling
ReleaseDiscovery().get_active_releases() synchronously on the FastMCP event
loop; change it to run the blocking GitHub I/O in the thread pool like the other
wrappers (see invoke_oar_command_async / invoke_cli_command_async) by scheduling
ReleaseDiscovery().get_active_releases() with loop.run_in_executor (or the
existing helper that wraps run_in_executor), and return the JSON result after
awaiting that offloaded call so the event loop is not blocked.
In `@oar/core/release_discovery.py`:
- Around line 445-461: The code assumes release_data and its "metadata" are
mappings and calls .get, which will raise AttributeError for non-mapping
payloads (e.g., metadata: null); update the logic in the try block to first
verify types (e.g., if not isinstance(release_data, dict) or not
isinstance(release_data.get("metadata", {}), dict)) then log a debug mentioning
the bad payload for that release and return False (skip this release) before
attempting to read release_date_str and parsing it; alternatively, coerce
metadata = release_data.get("metadata") or {} only after type checks so
AttributeError cannot escape.
---
Nitpick comments:
In `@tests/test_release_discovery.py`:
- Around line 138-147: Update the test_get_latest_release_for_ystream to include
prerelease entries in the mocked GraphQL data via setup_graphql_mock so it
contains "4.20.0-ec.1", "4.20.0-rc.1", and "4.20.0" (in addition to other patch
versions), then call ReleaseDiscovery().get_latest_release_for_ystream("4.20")
and assert the returned latest is "4.20.0" to lock in semver ordering that
treats the final release as greater than rc/ec prereleases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f89f64a5-f7a6-4839-8e61-57dcccfa5735
📒 Files selected for processing (4)
- mcp_server/server.py
- oar/core/release_discovery.py
- tests/test_release_discovery.py
- tools/release_progress_dashboard/release_progress_dashboard.py
- Extract blocking GitHub API calls into _discover_active_releases_sync() - Use run_in_executor() to avoid blocking FastMCP event loop - Update docstring: remove ConfigStore references, mention StateBox
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@mcp_server/server.py`:
- Around line 50-53: Runtime metadata still shows old counts after adding the
discovery tool; locate where tool_count and the API category counts are computed
(search for the tool_count variable and the dict key "api" in server.py) and
update the logic to compute counts dynamically from the current tools
registry/list rather than hard-coded values, ensuring the new discovery tool
(discover-active-releases) is included; also update the startup logging block
that emits the total and per-category tool counts so the health endpoint and
startup diagnostics reflect the recalculated values.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8e2274cb-083d-4930-abc9-c8f4f179de38
📒 Files selected for processing (1)
mcp_server/server.py
```diff
 - 1 discovery tool (discover-active-releases)
 - 3 cache management tools (mcp_cache_stats, mcp_cache_invalidate, mcp_cache_warm)

-Total: 28 tools (100% optimized - all CLI commands use direct Click invocation)
+Total: 29 tools (100% optimized - all CLI commands use direct Click invocation)
```
**Runtime tool metadata is stale after adding the discovery tool**

Docs now state 29 tools, but runtime reporting still returns old values (tool_count=28 on Line 1887, "api": 11 on Line 1902, and startup log counts on Lines 1990-1992). This makes health output and startup diagnostics inaccurate.

Suggested fix:

```diff
-    # Based on server.py: 28 total tools
-    tool_count = 28
+    # Based on server.py: 29 total tools
+    tool_count = 29
@@
-        "api": 11  # 4 config + 4 issue + 3 cache
+        "api": 12  # 4 config + 4 issue + 1 discovery + 3 cache
@@
-    logger.info(f"✓ Total tools: 28 (17 CLI + 11 direct API)")
+    logger.info(f"✓ Total tools: 29 (17 CLI + 12 direct API)")
@@
-    logger.info(f"✓ Direct API tools: 4 config + 4 issue + 3 cache")
+    logger.info(f"✓ Direct API tools: 4 config + 4 issue + 1 discovery + 3 cache")
```

As per coding guidelines: "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
Hello @rioliu-rh, thanks for your points above. I updated the PR with refactored code using GraphQL batch fetching and the other updates discussed in your comments. Could you please have a look? Thanks!
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@mcp_server/server.py`:
- Around line 1320-1321: The startup currently calls ReleaseDiscovery() (used by
discover_active_releases()) which requires GITHUB_TOKEN per
ReleaseDiscovery.__init__, but ConfigStore.validate_environment() and
check_env.sh do not mark GITHUB_TOKEN as required, causing health checks to miss
this failure; fix by adding "GITHUB_TOKEN" to the required environment variables
in ConfigStore.validate_environment() (oar/core/configstore.py) and update
check_env.sh to mark it required, or alternatively add an explicit validation
call for GITHUB_TOKEN in the MCP server startup before instantiating
ReleaseDiscovery() so the failure surfaces during startup rather than at
runtime.
In `@oar/core/release_discovery.py`:
- Around line 300-306: The parse blocks that call yaml.safe_load (e.g., the one
setting tracking_data[y_stream] and the similar StateBox parsing around the
later block) currently raise ReleaseDiscoveryException on yaml.YAMLError; change
both to swallow parsing errors instead: wrap yaml.safe_load in a try/except, on
success assign to tracking_data[y_stream] (or the StateBox container) and
logger.debug as before, but on yaml.YAMLError log a warning/error with the
y_stream/identifier and the exception details and continue (do not raise); this
preserves existing behavior for valid blobs while skipping malformed YAML.
- Around line 81-83: Remove the eager REST call to get_repo() that creates an
unused Repository object: stop assigning self.repo =
self._github.get_repo(self.repo_name) in ReleaseDiscovery.__init__ so the code
only constructs Auth.Token(token) and Github(auth=auth) (self._github) and
defers any repository lookup until/if needed via GraphQL; ensure no other code
expects self.repo to exist (update callers to use self._github when they need to
fetch repo lazily).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a97283ce-439b-4c63-ad01-7b26e01f83fe
📒 Files selected for processing (2)
mcp_server/server.py, oar/core/release_discovery.py
discovery = ReleaseDiscovery()
return discovery.get_active_releases()
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '%s\n' '--- validate_environment implementation ---'
rg -n -C8 '\bdef validate_environment\s*\(' --type=py || true
printf '\n%s\n' '--- GITHUB_TOKEN references in Python code ---'
rg -n -C2 '\bGITHUB_TOKEN\b' --type=py || true
printf '\n%s\n' '--- GITHUB_TOKEN references in deployment/runtime files ---'
rg -n -C1 '\bGITHUB_TOKEN\b' -g 'Dockerfile*' -g '*.yml' -g '*.yaml' -g '*.env' -g '*.sh' || true

Repository: openshift/release-tests
Length of output: 10180
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '%s\n' '--- Full validate_environment() implementation ---'
rg -n -A40 'def validate_environment' oar/core/configstore.py --type=py
printf '\n%s\n' '--- MCP server health/startup validation ---'
rg -n -B5 -A10 'def health|def startup|/health' mcp_server/server.py --type=py | head -100

Repository: openshift/release-tests
Length of output: 2677
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '%s\n' '--- Full health_check endpoint implementation ---'
rg -n -A30 '@mcp.custom_route\("/health"' mcp_server/server.py --type=py
printf '\n%s\n' '--- Check what validate_environment validates (continued) ---'
rg -n -A80 'required_vars = {' oar/core/configstore.py --type=py | head -60
printf '\n%s\n' '--- Check mcp_server check_env.sh for required vs optional ---'
cat mcp_server/check_env.sh

Repository: openshift/release-tests
Length of output: 1616
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '%s\n' '--- Full validate_environment validates (required vars section) ---'
sed -n '69,120p' oar/core/configstore.py
printf '\n%s\n' '--- MCP server check_env.sh full content ---'
cat mcp_server/check_env.sh
printf '\n%s\n' '--- Health check implementation (full endpoint) ---'
sed -n '1838,1920p' mcp_server/server.py

Repository: openshift/release-tests
Length of output: 10273
Add GITHUB_TOKEN validation to startup checks.
Line 1320 introduces a hard dependency on GITHUB_TOKEN via ReleaseDiscovery(), but ConfigStore.validate_environment() does not include it in required checks. This creates an operational gap: /health reports green while discover_active_releases() fails on every call.
GITHUB_TOKEN is currently marked optional in check_env.sh (line 62) and missing from the required vars list in validate_environment() (oar/core/configstore.py lines 69–97), even though ReleaseDiscovery.__init__ (oar/core/release_discovery.py line 71–73) explicitly raises an exception if absent.
Move GITHUB_TOKEN to required environment variables or add explicit validation in the MCP server startup path.
Also applies to: 1863–1865, 1985–1988
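One way to surface the missing token at startup might look roughly like the following. `validate_required_env` and the variable names here are illustrative, not the actual ConfigStore or check_env.sh interface:

```python
import os

def validate_required_env(names, env=None):
    """Return the names that are missing or empty in the environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Fail fast at server startup instead of on the first
# discover_active_releases() call. The simulated environment below is
# for illustration only.
missing = validate_required_env(
    ["GITHUB_TOKEN", "JIRA_TOKEN"],
    env={"JIRA_TOKEN": "dummy"},
)
if missing:
    # The real startup path would raise an exception or abort here.
    print(f"Missing required environment variables: {missing}")
```

Running the startup check eagerly keeps `/health` honest: a missing token fails deployment instead of silently failing every discovery call.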
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@mcp_server/server.py` around lines 1320 - 1321, The startup currently calls
ReleaseDiscovery() (used by discover_active_releases()) which requires
GITHUB_TOKEN per ReleaseDiscovery.__init__, but
ConfigStore.validate_environment() and check_env.sh do not mark GITHUB_TOKEN as
required, causing health checks to miss this failure; fix by adding
"GITHUB_TOKEN" to the required environment variables in
ConfigStore.validate_environment() (oar/core/configstore.py) and update
check_env.sh to mark it required, or alternatively add an explicit validation
call for GITHUB_TOKEN in the MCP server startup before instantiating
ReleaseDiscovery() so the failure surfaces during startup rather than at
runtime.
auth = Auth.Token(token)
self._github = Github(auth=auth)
self.repo = self._github.get_repo(self.repo_name)
🧩 Analysis chain
🏁 Script executed:
# 1. Check if self.repo is used anywhere in the file
rg "self\.repo" oar/core/release_discovery.py

Repository: openshift/release-tests
Length of output: 259
🏁 Script executed:
# 2. Check how ReleaseDiscovery is instantiated in mcp_server/server.py
cat -n mcp_server/server.py | head -200

Repository: openshift/release-tests
Length of output: 9356
🏁 Script executed:
# 3. Look at the full constructor and YAML handling in release_discovery.py
wc -l oar/core/release_discovery.py

Repository: openshift/release-tests
Length of output: 100
🏁 Script executed:
# 4. Check lines around 300-306 and 403-409 for YAML error handling
sed -n '75,90p' oar/core/release_discovery.py
sed -n '295,310p' oar/core/release_discovery.py
sed -n '398,415p' oar/core/release_discovery.py

Repository: openshift/release-tests
Length of output: 2568
🏁 Script executed:
# Search for ANY reference to self.repo in the entire release_discovery.py file
rg "self\.repo(?!_)" oar/core/release_discovery.py -n

Repository: openshift/release-tests
Length of output: 295
🏁 Script executed:
# Search for ReleaseDiscovery instantiation in mcp_server/server.py
rg "ReleaseDiscovery" mcp_server/server.py -n -A 5 -B 2

Repository: openshift/release-tests
Length of output: 1354
🏁 Script executed:
# Get the full discover-active-releases tool implementation
sed -n '1,462p' oar/core/release_discovery.py | tail -100

Repository: openshift/release-tests
Length of output: 4020
🏁 Script executed:
# Search for all occurrences of "repo" (case-insensitive) to catch any usage
rg "\.repo" oar/core/release_discovery.py -n

Repository: openshift/release-tests
Length of output: 268
🏁 Script executed:
# Look at the constructor and early method definitions more carefully
sed -n '75,120p' oar/core/release_discovery.py

Repository: openshift/release-tests
Length of output: 1844
🏁 Script executed:
# Check if self.repo might be used in any method
ast-grep --pattern 'def $METHOD($_) { $$$ self.repo $$$ }'

Repository: openshift/release-tests
Length of output: 49
🌐 Web query:
PyGithub Github.get_repo() REST API call behavior
💡 Result:
The PyGithub method Github.get_repo(full_name_or_id, lazy=False) makes a REST API call to the GitHub API to retrieve a Repository object. It calls GET /repos/{owner}/{repo} when provided with a full name string like "owner/repo", or GET /repositories/{id} when provided with a repository ID integer. This is documented in the official PyGithub documentation: "get_repo(full_name_or_id: int | str, lazy: bool = False) → Repository Calls GET /repos/{owner}/{repo} or GET /repositories/{id}"

If lazy=True, it creates a partially initialized CompletableGithubObject. Accessing uninitialized attributes triggers an additional request to complete the object with all attributes. By default (lazy=False), it fetches the full repository details immediately upon calling get_repo.

Examples throughout the docs show usage like repo = g.get_repo("PyGithub/PyGithub") followed by immediate access to attributes like repo.stargazers_count, repo.get_topics, etc., confirming the initial API call populates the object. The PyGithub library (latest v2.9.0 as of 2026-03-22) wraps the GitHub REST API v3, enabling typed interactions with resources like repositories.
Citations:
- 1: https://pygithub.readthedocs.io/en/stable/github_objects/Repository.html
- 2: https://pygithub.readthedocs.io/en/v2.6.0/github.html
- 3: https://pygithub.readthedocs.io/en/v2.6.1/github.html
- 4: https://pygithub.readthedocs.io/en/stable/examples/Repository.html
- 5: https://pygithub.readthedocs.io/en/latest/examples/Repository.html
- 6: https://github.com/PyGithub/PyGithub
🏁 Script executed:
# Verify the ReleaseDiscovery constructor doesn't receive repo_name argument in mcp_server
sed -n '1315,1325p' mcp_server/server.py

Repository: openshift/release-tests
Length of output: 405
🏁 Script executed:
# Check ReleaseDiscovery.__init__ signature to see default values
sed -n '60,85p' oar/core/release_discovery.py

Repository: openshift/release-tests
Length of output: 1122
Remove the unused eager get_repo() call.
Line 83 creates a Repository object that is never used. Since PyGithub.get_repo() performs an immediate REST API call and mcp_server/server.py creates a fresh ReleaseDiscovery() per request (line 1320), every discovery request pays that latency and adds a failure point before the GraphQL path even starts, with no functional benefit.
Suggested change
auth = Auth.Token(token)
self._github = Github(auth=auth)
- self.repo = self._github.get_repo(self.repo_name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oar/core/release_discovery.py` around lines 81 - 83, Remove the eager REST
call to get_repo() that creates an unused Repository object: stop assigning
self.repo = self._github.get_repo(self.repo_name) in ReleaseDiscovery.__init__
so the code only constructs Auth.Token(token) and Github(auth=auth)
(self._github) and defers any repository lookup until/if needed via GraphQL;
ensure no other code expects self.repo to exist (update callers to use
self._github when they need to fetch repo lazily).
try:
    tracking_data[y_stream] = yaml.safe_load(yaml_text)
    logger.debug(f"Parsed tracking file for y-stream {y_stream}")
except yaml.YAMLError as e:
    raise ReleaseDiscoveryException(
        f"Failed to parse tracking file for {y_stream}: {e}"
    ) from e
Skip malformed YAML blobs instead of failing the entire batch.
These two parse blocks still abort discovery on the first bad tracking or StateBox file. One corrupted historical file would make the slash-command/dashboard path return no discoveries at all even though the other releases are still readable.
Suggested change
try:
tracking_data[y_stream] = yaml.safe_load(yaml_text)
logger.debug(f"Parsed tracking file for y-stream {y_stream}")
except yaml.YAMLError as e:
- raise ReleaseDiscoveryException(
- f"Failed to parse tracking file for {y_stream}: {e}"
- ) from e
+ logger.warning(f"Skipping malformed tracking file for {y_stream}: {e}")
@@
try:
statebox_data[release] = yaml.safe_load(yaml_text)
logger.debug(f"Parsed StateBox file for release {release}")
except yaml.YAMLError as e:
- raise ReleaseDiscoveryException(
- f"Failed to parse StateBox file for {release}: {e}"
- ) from e
+ logger.warning(f"Skipping malformed StateBox file for {release}: {e}")

Also applies to: 403-409
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oar/core/release_discovery.py` around lines 300 - 306, The parse blocks that
call yaml.safe_load (e.g., the one setting tracking_data[y_stream] and the
similar StateBox parsing around the later block) currently raise
ReleaseDiscoveryException on yaml.YAMLError; change both to swallow parsing
errors instead: wrap yaml.safe_load in a try/except, on success assign to
tracking_data[y_stream] (or the StateBox container) and logger.debug as before,
but on yaml.YAMLError log a warning/error with the y_stream/identifier and the
exception details and continue (do not raise); this preserves existing behavior
for valid blobs while skipping malformed YAML.
♻️ Duplicate comments (2)
oar/core/release_discovery.py (2)
81-83: ⚠️ Potential issue | 🟠 Major

Drop the unused eager get_repo() lookup.

self.repo is never read in this module, but Github.get_repo() adds a REST call and an extra failure point. mcp_server/server.py (lines 1315-1322) creates a fresh ReleaseDiscovery() per tool invocation, so every slash-command/dashboard request pays that cost before the GraphQL path even starts.

♻️ Proposed fix

  auth = Auth.Token(token)
  self._github = Github(auth=auth)
- self.repo = self._github.get_repo(self.repo_name)

#!/bin/bash
# Verify that self.repo is only assigned here and a new ReleaseDiscovery
# instance is created for each MCP invocation.
rg -nC2 '\bself\.repo\b' oar/core/release_discovery.py
rg -nC3 'ReleaseDiscovery\(' mcp_server/server.py

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@oar/core/release_discovery.py` around lines 81-83, the current code eagerly calls Github.get_repo() and assigns to self.repo (auth = Auth.Token(token); self._github = Github(auth=auth); self.repo = self._github.get_repo(self.repo_name)), which is unused and causes unnecessary REST calls; remove the eager lookup by deleting the self.repo assignment and only keep creating the Github client (self._github = Github(auth=auth)), or defer get_repo() to where it is actually needed (e.g., callers that use self._github can invoke self._github.get_repo(self.repo_name) when required); ensure ReleaseDiscovery construction in mcp_server/server.py still works without self.repo being present.
298-304: ⚠️ Potential issue | 🟠 Major

Skip malformed YAML blobs instead of failing the whole discovery run.

A single bad tracking or StateBox file still raises ReleaseDiscoveryException. mcp_server/server.py (lines 1325-1364) converts that into {"releases": []}, so one corrupted blob blanks the slash command/dashboard result even when the other releases are fine.

♻️ Proposed fix

 if yaml_text:
     try:
-        tracking_data[y_stream] = yaml.safe_load(yaml_text)
-        logger.debug(f"Parsed tracking file for y-stream {y_stream}")
+        parsed = yaml.safe_load(yaml_text)
     except yaml.YAMLError as e:
-        raise ReleaseDiscoveryException(
-            f"Failed to parse tracking file for {y_stream}: {e}"
-        ) from e
+        logger.warning(f"Skipping malformed tracking file for {y_stream}: {e}")
+        break
+    if not isinstance(parsed, dict):
+        logger.warning(
+            f"Skipping tracking file for {y_stream}: expected mapping, got {type(parsed).__name__}"
+        )
+        break
+    tracking_data[y_stream] = parsed
+    logger.debug(f"Parsed tracking file for y-stream {y_stream}")
@@
 if blob_object and blob_object.get("text"):
     yaml_text = blob_object.get("text")
     try:
-        statebox_data[release] = yaml.safe_load(yaml_text)
-        logger.debug(f"Parsed StateBox file for release {release}")
+        parsed = yaml.safe_load(yaml_text)
     except yaml.YAMLError as e:
-        raise ReleaseDiscoveryException(
-            f"Failed to parse StateBox file for {release}: {e}"
-        ) from e
+        logger.warning(f"Skipping malformed StateBox file for {release}: {e}")
+        continue
+    if not isinstance(parsed, dict):
+        logger.warning(
+            f"Skipping StateBox file for {release}: expected mapping, got {type(parsed).__name__}"
+        )
+        continue
+    statebox_data[release] = parsed
+    logger.debug(f"Parsed StateBox file for release {release}")

Also applies to: 395-401

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@oar/core/release_discovery.py` around lines 298-304, the code currently raises ReleaseDiscoveryException when yaml.safe_load fails (the try/except around tracking_data[y_stream] = yaml.safe_load(...) and the analogous block around StateBox parsing), which aborts the whole discovery run; change both handlers to catch yaml.YAMLError, log a warning with the y_stream (or StateBox identifier) and the parsing error, and simply continue (skip that malformed blob) instead of re-raising the exception so other releases are still processed. Ensure you update the two catch sites (the block assigning tracking_data[y_stream] and the similar StateBox parsing block) to use logger.warning/debug with context and not raise ReleaseDiscoveryException.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@oar/core/release_discovery.py`:
- Around line 81-83: The current code eagerly calls Github.get_repo() and
assigns to self.repo (auth = Auth.Token(token); self._github =
Github(auth=auth); self.repo = self._github.get_repo(self.repo_name)), which is
unused and causes unnecessary REST calls; remove the eager lookup by deleting
the self.repo assignment and only keep creating the Github client (self._github
= Github(auth=auth)), or defer get_repo() to where it is actually needed (e.g.,
callers that use self._github can invoke self._github.get_repo(self.repo_name)
when required); ensure ReleaseDiscovery construction in mcp_server/server.py
still works without self.repo being present.
- Around line 298-304: The code currently raises ReleaseDiscoveryException when
yaml.safe_load fails (the try/except around tracking_data[y_stream] =
yaml.safe_load(...) and the analogous block around StateBox parsing), which
aborts the whole discovery run; change both handlers to catch yaml.YAMLError,
log a warning with the y_stream (or StateBox identifier) and the parsing error,
and simply continue (skip that malformed blob) instead of re-raising the
exception so other releases are still processed. Ensure you update the two catch
sites (the block assigning tracking_data[y_stream] and the similar StateBox
parsing block) to use logger.warning/debug with context and not raise
ReleaseDiscoveryException.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b3e1234d-aff6-4076-8b9e-366821aae6d1
📒 Files selected for processing (1)
oar/core/release_discovery.py
@LuboTerifaj: all tests passed! Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
https://redhat.atlassian.net/browse/OCPERT-347