OCPERT-347 Automate active release discovery for slash command and dashboard #979

Open

LuboTerifaj wants to merge 9 commits into openshift:main from LuboTerifaj:ocpert-347

Conversation

@LuboTerifaj
Contributor

Enable auto-discovery for Release Progress Dashboard

- Add discover_active_releases MCP tool
- Add auto-load button to dashboard for discovering active releases
- Update active-releases slash command to use discovery tool
- Move logic from MCP server to oar.core.release_discovery module
- Use GitHub API with authentication for discovery and ConfigStore for date filtering
- Add unit tests with mocked GitHub API and ConfigStore

Used by the MCP server to power the Release Progress Dashboard and the Claude Code slash command.
@coderabbitai

coderabbitai bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the `reviews.auto_review.auto_pause_after_reviewed_commits` setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a GitHub-backed ReleaseDiscovery class and exception, a new MCP tool exposing active z-stream releases, client/data-collector wrappers, Streamlit UI integration to load discovered releases, unit tests for discovery logic, and a command-spec documentation file.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Release discovery core (`oar/core/release_discovery.py`, `oar/core/exceptions.py`) | New `ReleaseDiscovery` class (GitHub GraphQL queries for `_releases/*`, tracking-file parsing, StateBox YAML retrieval, semver selection, caching, date-based active filtering) and `ReleaseDiscoveryException`. |
| MCP server tool (`mcp_server/server.py`) | Adds `discover_active_releases()` MCP tool that runs `ReleaseDiscovery.get_active_releases()` on the CLI thread pool; updates tool counts and health metadata. |
| MCP client / data collector (`tools/mcp_data_collector.py`) | Adds `discover_active_releases_async()` and `discover_active_releases()` wrappers to call the new MCP tool and return parsed results. |
| Streamlit dashboard UI (`tools/release_progress_dashboard/release_progress_dashboard.py`) | Adds a sidebar "🔍 Load Active Releases" button that calls the data collector, merges discovered releases into `st.session_state.releases`, and shows success/info/error messages with `st.rerun()`. |
| Tests (`tests/test_release_discovery.py`) | New unit tests for `ReleaseDiscovery`: init with/without token, `get_supported_ystreams()`, `get_latest_release_for_ystream()`, and `get_active_releases()` using mocked GitHub GraphQL/blob responses and env patches. |
| Documentation / command spec (`.claude/commands/release/active-releases.md`) | Adds a command spec describing the discovery flow, StateBox retrieval and YAML parsing, status/phase/issue handling rules, StateBox link format, sorting, and the console output template. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from barboras7 and cardelearht March 30, 2026 23:56
@LuboTerifaj LuboTerifaj requested a review from rioliu-rh March 30, 2026 23:56

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/release/active-releases.md:
- Around line 235-240: Update the command notes to reflect the implemented flow:
change the date filtering note to say releases are kept through release_date + 1
day (not 2 days), remove or amend the claim that only oar_get_release_status()
is used per release and instead note that the status logic also calls
oar_is_release_shipped() to determine SHIPPED, and update the
MCP-calls/parallel-execution note to indicate that more than one MCP call may be
used per release (oar_get_release_status() plus oar_is_release_shipped()) while
still executing calls in parallel for performance.

In `@mcp_server/server.py`:
- Around line 1333-1339: The handler currently constructs
ReleaseDiscovery(configstore_factory=get_cached_configstore) and calls
discovery.get_active_releases() directly (blocking IO) on the FastMCP event
loop; extract that blocking work into a synchronous helper (e.g.,
_sync_get_active_releases or similar) that creates ReleaseDiscovery and returns
the active releases, then call it from the async handler via the thread pool
using asyncio.get_running_loop().run_in_executor(None,
_sync_get_active_releases) (mirroring
invoke_oar_command_async()/invoke_cli_command_async() patterns) and return
json.dumps({"releases": result}) from the awaiting coroutine.

In `@oar/core/release_discovery.py`:
- Around line 219-235: _is_release_active currently only catches
ConfigStoreException, so a malformed date from ConfigStore.get_release_date can
raise ValueError and abort discovery; update _is_release_active to additionally
catch ValueError (and optionally TypeError) around the datetime.strptime
parsing, log a warning including the release id and the offending
release_date_str, and return False so get_active_releases() skips that release
instead of failing the whole run.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ca824bc-6379-4361-babe-9340c78d6bfd

📥 Commits

Reviewing files that changed from the base of the PR and between e6c25c2 and 84c6138.

📒 Files selected for processing (7)
  • .claude/commands/release/active-releases.md
  • mcp_server/server.py
  • oar/core/exceptions.py
  • oar/core/release_discovery.py
  • tests/test_release_discovery.py
  • tools/mcp_data_collector.py
  • tools/release_progress_dashboard/release_progress_dashboard.py

Comment on lines +235 to +240
- **Auto-discovery**: Automatically finds releases from tracking files (no hardcoded lists)
- **Date filtering**: Excludes releases with release_date older than 2 days
- **StateBox required**: Only shows releases with StateBox (ready for QE work)
- **Timing**: Releases appear 1-2 days after cut-off when ART creates StateBox
- **Single MCP call per release**: Only `oar_get_release_status()` - contains all needed data
- **Parallel execution**: Call MCP tools in parallel for all releases for performance


⚠️ Potential issue | 🟡 Minor

Align the command notes with the implemented flow.

This footer says discovery excludes releases older than 2 days and that only oar_get_release_status() is needed per release. The new implementation keeps releases only through release_date + 1 day, and the status logic above still depends on oar_is_release_shipped() for SHIPPED.

Proposed fix
-- **Date filtering**: Excludes releases with release_date older than 2 days
+- **Date filtering**: Excludes releases more than 1 day past `release_date`
 - **StateBox required**: Only shows releases with StateBox (ready for QE work)
 - **Timing**: Releases appear 1-2 days after cut-off when ART creates StateBox
-- **Single MCP call per release**: Only `oar_get_release_status()` - contains all needed data
+- **Per-release MCP calls**: Use `oar_get_release_status()` for task/issue data and `oar_is_release_shipped()` for live shipment state
 - **Parallel execution**: Call MCP tools in parallel for all releases for performance

Contributor

@rioliu-rh rioliu-rh left a comment


Thanks for implementing the active release discovery feature! The overall approach is solid, but there are four performance and correctness issues (two of them critical) that need to be addressed before merging.

Summary of Issues

1. 🔴 CRITICAL: Expensive ConfigStore Initialization (line 226)

  • Problem: Creates ConfigStore for each release (~1000ms each) just to read release_date
  • Impact: Adds 10+ seconds overhead for 10 y-streams
  • Fix: Query StateBox YAML directly (10x faster)

2. 🔴 CRITICAL: Incorrect Semver Parsing (line 154)

  • Problem: Uses naive int() conversion - breaks on pre-release tags (e.g., 4.20.0-rc1)
  • Impact: Crashes on certain version formats, inconsistent with detector.py patterns
  • Fix: Use sorted() with semver.VersionInfo.parse() (matches detector.py:34-37 pattern)

3. 🟡 Performance: Sequential REST API Calls (line 195)

  • Problem: Makes 11+ sequential REST calls to discover y-streams
  • Impact: 3-8 seconds of network latency
  • Fix: Use GraphQL to batch-fetch directory structure in 1 call

4. 🟡 Future-proofing: Hardcoded OCP 4.x Regex (line 102)

  • Problem: Only supports the `4\.\d{1,2}` pattern
  • Impact: Breaks when OCP 5.x is released
  • Fix: Use `\d+\.\d{1,2}` to support any major version

Performance Impact

| Component | Before | After | Improvement |
| --- | --- | --- | --- |
| Y-stream discovery | 11 REST calls (3-8s) | 1 GraphQL call (0.5s) | ~6-16x faster |
| Release date lookup | 10 ConfigStore inits (10s) | 10 StateBox reads (1s) | ~10x faster |
| Total time | 13-18 seconds | 2-3 seconds | ~8x faster |

Action Items

Please address the inline comments with code examples. All fixes follow existing patterns in the codebase:

  • Semver usage → see oar/controller/detector.py:34-37 (sorted approach) and detector.py:62-65 (VersionInfo)
  • StateBox querying → see oar/core/statebox.py
  • GraphQL queries → tested and verified to work with GitHub API

Let me know if you have any questions about the proposed solutions!

contents = self.repo.get_contents(self.RELEASES_PATH, ref=self.branch)

# Extract y-stream versions from directory names
y_streams = []
Contributor


Issue: Hardcoded regex only supports OCP 4.x versions

Why this is a problem:
This regex pattern ^4\.\d{1,2}$ is hardcoded to match only versions starting with "4." (e.g., 4.12, 4.20). When OpenShift releases 5.x or 10.x versions in the future, this code will break and fail to discover those releases.

Solution:
Make the regex future-proof by supporting any major version:

# Support any major version: 4.x, 5.x, 10.x, etc.
pattern = re.compile(r'^\d+\.\d{1,2}$')

This matches the same pattern used in oar/controller/detector.py:97 but is more future-proof.
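As a quick sanity check, the proposed pattern accepts any major version while still rejecting z-stream releases and non-version directory names (a minimal sketch; the directory names are illustrative):

```python
import re

# Future-proof y-stream pattern from the suggestion above
pattern = re.compile(r'^\d+\.\d{1,2}$')

assert pattern.match("4.20")          # current OCP 4.x
assert pattern.match("5.1")           # future OCP 5.x
assert pattern.match("10.12")         # double-digit major versions
assert not pattern.match("4.20.1")    # z-stream release, not a y-stream dir
assert not pattern.match("statebox")  # non-version directory name
```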

Contributor Author


Yeah, good catch! Thanks.


for y_stream in y_streams:
latest_release = self.get_latest_release_for_ystream(y_stream)
if not latest_release:
Contributor


Issue: Inefficient API usage - sequential REST calls and redundant logic

Why this is a problem:
Current implementation has two inefficiencies:

  1. Sequential REST calls: Makes 11+ sequential REST calls (1 for _releases/ + 10 for each 4.X.z.yaml)
  2. Redundant StateBox listing: After getting latest release from tracking file, the current approach would list StateBox directory again to find the same release

Solution:
Use PyGithub's native GraphQL support to batch-fetch tracking files, then directly access StateBox:

from semver import VersionInfo

def get_active_releases(self, keep_days_after_release: int = 1) -> List[str]:
    """Fast discovery using GraphQL to batch-fetch tracking files"""
    
    # Use PyGithub's native GraphQL support (no requests library needed)
    query = """
    query {
      repository(owner: "openshift", name: "release-tests") {
        object(expression: "z-stream:_releases") {
          ... on Tree {
            entries {
              name
              type
              object {
                ... on Tree {
                  entries {
                    name
                    type
                    object {
                      ... on Blob {
                        text
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    """
    
    # Use PyGithub's requestJsonAndCheck (native GraphQL support)
    headers, data = self._github._Github__requester.requestJsonAndCheck(
        "POST",
        "/graphql",
        input={"query": query}
    )
    
    active_releases = []
    pattern = re.compile(r'^\d+\.\d{1,2}$')  # Future-proof: 4.x, 5.x, 10.x
    
    # Process each y-stream directory
    for y_stream_dir in data['data']['repository']['object']['entries']:
        if y_stream_dir['type'] != 'tree':
            continue
        
        y_stream = y_stream_dir['name']
        if not pattern.match(y_stream):
            continue
        
        # Find the tracking file: {y-stream}.z.yaml
        tracking_file = f"{y_stream}.z.yaml"
        for file_entry in y_stream_dir['object']['entries']:
            if file_entry['name'] == tracking_file and file_entry['type'] == 'blob':
                try:
                    # Parse YAML content from GraphQL response
                    yaml_content = file_entry['object']['text']
                    tracking_data = yaml.safe_load(yaml_content)
                    
                    # Get all releases and find latest using semver
                    releases = list(tracking_data.get("releases", {}).keys())
                    if not releases:
                        continue
                    
                    # Sort using semver (like detector.py:34-37)
                    sorted_releases = sorted(releases, key=lambda v: VersionInfo.parse(v))
                    latest = sorted_releases[-1]
                    
                    # Check if active (queries StateBox directly - see line 226 comment)
                    if self._is_release_active(latest, keep_days_after_release):
                        active_releases.append(latest)
                        
                except Exception as e:
                    logger.debug(f"Error processing y-stream {y_stream}: {e}")
                break
    
    logger.info(f"Discovered {len(active_releases)} active releases: {active_releases}")
    return active_releases

Key improvements:

  1. PyGithub native GraphQL: Uses requestJsonAndCheck() (no requests library import)
  2. Single API call: Fetches all tracking files in one GraphQL query
  3. No redundant StateBox listing: Directly fetches StateBox by filename (handled in _is_release_active())
  4. Performance: 11+ REST calls (~3-8s) → 1 GraphQL call (~0.5s) = 6-16x faster

Contributor Author


Good point. I will update it to use GraphQL.


if not releases:
logger.debug(f"No releases found for y-stream {y_stream}")
return None
Contributor


Issue: Incorrect semver parsing using naive int() conversion

Why this is a problem:
This approach tuple(map(int, v.split('.'))) is problematic:

  1. Fails for pre-release tags: Crashes on versions like 4.20.0-rc1 (can't convert '0-rc1' to int)
  2. Doesn't follow semver rules: Naive integer comparison doesn't handle semantic versioning correctly
  3. Inconsistent with codebase: oar/controller/detector.py:62-65 uses VersionInfo.parse() for version comparison

Solution:
Use sorted() with semver.VersionInfo (matches detector.py pattern):

from semver import VersionInfo

def get_latest_release_for_ystream(self, y_stream: str) -> Optional[str]:
    # ... (fetch releases code) ...
    
    if not releases:
        logger.debug(f"No releases found for y-stream {y_stream}")
        return None
    
    # Sort using semver (similar to detector.py:34-37, but with proper semver)
    sorted_releases = sorted(releases, key=lambda v: VersionInfo.parse(v))
    latest = sorted_releases[-1]
    
    logger.debug(f"Latest release for {y_stream}: {latest}")
    return latest

Why sorted() instead of max():
Following the pattern in detector.py:34-37 which uses sorted() and takes the last element. Both work correctly, but sorted() is more explicit and matches existing codebase patterns.

Contributor Author


Agree on using semver, especially for possible future cases if we support versions beyond z-streams (e.g., pre-release tags like rc, ec).

However, I'm using max() instead of sorted() because:

  • More efficient: O(n) vs O(n log n)
  • Clearer intent: we only need the maximum, not a sorted list
  • More Pythonic: max() is the standard idiom for "find largest"
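To illustrate the point, here is a stdlib-only sketch of the `max()` idiom. The real code would pass `semver.VersionInfo.parse` as the key; the `semver_key` helper below is only a simplified stand-in that mimics semver ordering for these inputs:

```python
def semver_key(version):
    # Split "4.20.0-rc.1" into a numeric core and an optional prerelease tag.
    core, _, prerelease = version.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    # A final release (empty prerelease) ranks above any prerelease of the
    # same core, mirroring semver ordering for this example.
    return (numbers, prerelease == "", prerelease)

releases = ["4.20.2", "4.20.0-ec.1", "4.20.10", "4.20.0-rc.1"]
latest = max(releases, key=semver_key)  # O(n); no sorted list materialized
# latest == "4.20.10"
```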


# Parse date in format: 2026-Mar-25
release_date = datetime.strptime(release_date_str, "%Y-%b-%d").date()
today = datetime.now().date()
Contributor


CRITICAL Issue: Expensive ConfigStore initialization

Why this is a problem:
Creating a ConfigStore instance for each release is extremely expensive:

  • Each ConfigStore init costs ~1000ms (JWE decryption + GitHub HTTP request + YAML parsing)
  • For 10 y-streams, this adds 10+ seconds of overhead
  • ConfigStore data is immutable, but we only need one field: release_date
  • Even with MCP server caching, cache misses are very expensive

Solution:
Query release_date directly from StateBox YAML instead:

def _is_release_active(self, release: str, keep_days: int) -> bool:
    """
    Check if release is within date window (release_date + keep_days).
    
    Queries StateBox YAML directly instead of creating expensive ConfigStore instance.
    
    Args:
        release: Release version (e.g., "4.20.17")
        keep_days: Number of days after release_date to keep visible
    
    Returns:
        True if release is active, False otherwise
    """
    try:
        # Extract y-stream from release (e.g., "4.20.17" -> "4.20")
        y_stream = '.'.join(release.split('.')[:2])
        statebox_path = f"{self.RELEASES_PATH}/{y_stream}/statebox/{release}.yaml"
        
        # Fetch StateBox file
        file_content = self.repo.get_contents(statebox_path, ref=self.branch)
        statebox_data = yaml.safe_load(file_content.decoded_content.decode('utf-8'))
        
        # Extract release_date from metadata
        release_date_str = statebox_data['metadata']['release_date']
        
        # Parse date (format: "2026-Mar-25")
        release_date = datetime.strptime(release_date_str, "%Y-%b-%d").date()
        today = datetime.now().date()
        
        is_active = today <= release_date + timedelta(days=keep_days)
        logger.debug(
            f"Release {release} {'active' if is_active else 'past visibility window'} "
            f"(release_date: {release_date_str})"
        )
        return is_active
        
    except Exception as e:
        # StateBox not ready for this release (build data not available yet)
        logger.debug(f"StateBox not available for {release}: {e}")
        return False

Performance improvement: 10 ConfigStore inits (~10s) → 10 StateBox reads (~1s) = ~10x faster

Example StateBox structure:

schema_version: '1.0'
release: 4.20.17
metadata:
  release_date: 2026-Mar-25  # <-- This is what we need
  jira_ticket: ART-14325
  advisory_ids:
    rpm: 157165
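
The date handling above can be exercised in isolation. Note that `%b` matches the abbreviated English month name, so this assumes a C/English locale (a sketch, with `keep_days` as an assumed parameter value):

```python
from datetime import date, datetime, timedelta

release_date_str = "2026-Mar-25"  # StateBox metadata format: %Y-%b-%d
release_date = datetime.strptime(release_date_str, "%Y-%b-%d").date()

keep_days = 1  # visibility window after the release date
cutoff = release_date + timedelta(days=keep_days)

# The release stays "active" while today <= cutoff
assert release_date == date(2026, 3, 25)
assert cutoff == date(2026, 3, 26)
```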

Contributor Author


I considered several approaches initially - each has some issues:

  • ConfigStore: New cache every 7 days per version - not ideal
  • StateBox class: Calls ConfigStore in constructor, same problem
  • Direct REST calls: N+1 query issue

Per your suggestion, I explored GraphQL, which gives us two options:

  • Two queries (tracking + specific StateBox files): ~186 KB, only what we need
  • One query (all StateBox files): Simpler, but ~1,192 KB (84% waste)

Going with two queries, as it is more efficient (it fetches only the 10 needed files) and scales better as the number of StateBox files grows.
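For the two-query variant, the second query can batch all needed StateBox blobs with GraphQL aliases. The helper below is hypothetical (the function name, branch, and path layout are assumptions based on the paths discussed above), but it sketches how the alias lines could be built:

```python
def statebox_alias_lines(releases, branch="z-stream", base_path="_releases"):
    """Build one GraphQL alias per release that fetches its StateBox blob."""
    lines = []
    for i, release in enumerate(releases):
        y_stream = ".".join(release.split(".")[:2])  # "4.20.17" -> "4.20"
        path = f"{base_path}/{y_stream}/statebox/{release}.yaml"
        lines.append(
            f'r{i}: object(expression: "{branch}:{path}") '
            "{ ... on Blob { text } }"
        )
    return lines

aliases = statebox_alias_lines(["4.20.17", "4.19.31"])
```

Embedding these alias lines inside a single `repository(...) { ... }` query fetches every needed StateBox file in one round trip.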

st.sidebar.success(f"✅ Added {added_count} active release(s)")
st.rerun()
else:
st.sidebar.info("All active releases already tracked")
Contributor


Issue: Misleading message when no active releases found

Why this is a problem:
When discovery returns an empty list (e.g., during holidays when no releases are active), the current message says:

"All active releases already tracked"

This is misleading because it implies active releases exist but are already tracked, when the reality is that no active releases were found.

Solution:
Distinguish between two scenarios:

if added_count > 0:
    st.sidebar.success(f"✅ Added {added_count} active release(s)")
    st.rerun()
else:
    # Distinguish: no releases found vs all already tracked
    if len(active_releases) == 0:
        st.sidebar.info("ℹ️ No active releases found")
    else:
        st.sidebar.info(f"All {len(active_releases)} active release(s) already tracked")

Why this matters:

  • Clear communication: User understands discovery ran but found nothing
  • Avoids confusion: Doesn't suggest releases exist when they don't
  • Future-proof: Message not tied to hardcoded keep_days_after_release=1 parameter

Contributor Author


Yeah, including the number of releases is a good idea to avoid confusion.

- Remove ConfigStore dependency, fetch StateBox/tracking files directly via GraphQL
- Replace REST API calls with GraphQL batch queries
- Fix regex pattern for future-proof y-stream matching
- Fix semver parsing for rc/ec version handling
- Refine dashboard message
@openshift-ci

openshift-ci bot commented Apr 1, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from rioliu-rh. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
oar/core/release_discovery.py (1)

445-461: ⚠️ Potential issue | 🟠 Major

Handle non-mapping StateBox payloads without aborting discovery.

This now skips bad date strings, but Line 447 still assumes both release_data and metadata are dicts. A malformed YAML like metadata: null will raise AttributeError here and fail the whole batch instead of just skipping that release.

Suggested hardening
         try:
+            if not isinstance(release_data, dict):
+                logger.debug(f"Invalid StateBox data for {release}: {release_data!r}")
+                return False
+
             # Extract release_date from StateBox metadata
-            release_date_str = release_data.get("metadata", {}).get("release_date")
+            metadata = release_data.get("metadata")
+            if not isinstance(metadata, dict):
+                logger.debug(f"Invalid StateBox metadata for {release}: {metadata!r}")
+                return False
+            release_date_str = metadata.get("release_date")
             if not release_date_str:
                 logger.debug(f"release_date not found in StateBox for {release}")
                 return False
@@
-        except (ValueError, KeyError) as e:
+        except (AttributeError, TypeError, ValueError) as e:
             logger.debug(f"Failed to parse release_date for {release}: {e}")
             return False
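The hardening can be reduced to a small self-contained helper. This is an illustrative sketch (the function name and the None-return convention are mine, not the module's):

```python
import logging

logger = logging.getLogger(__name__)

def extract_release_date(release, release_data):
    """Return release_date from StateBox data, or None for malformed payloads."""
    if not isinstance(release_data, dict):
        logger.debug("Invalid StateBox data for %s: %r", release, release_data)
        return None
    metadata = release_data.get("metadata")
    if not isinstance(metadata, dict):  # catches `metadata: null`, lists, strings
        logger.debug("Invalid StateBox metadata for %s: %r", release, metadata)
        return None
    return metadata.get("release_date")
```

With these type checks in front, `metadata: null` or a bare-string YAML document skips one release instead of raising AttributeError and aborting the whole batch.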
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@oar/core/release_discovery.py` around lines 445 - 461, The code assumes
release_data and its "metadata" are mappings and calls .get, which will raise
AttributeError for non-mapping payloads (e.g., metadata: null); update the logic
in the try block to first verify types (e.g., if not isinstance(release_data,
dict) or not isinstance(release_data.get("metadata", {}), dict)) then log a
debug mentioning the bad payload for that release and return False (skip this
release) before attempting to read release_date_str and parsing it;
alternatively, coerce metadata = release_data.get("metadata") or {} only after
type checks so AttributeError cannot escape.
mcp_server/server.py (1)

1333-1338: ⚠️ Potential issue | 🟠 Major

Move discovery work off the FastMCP event loop.

Lines 1335-1336 still run blocking GitHub I/O directly inside the async tool. Unlike invoke_oar_command_async() and invoke_cli_command_async(), one slow discovery call can stall unrelated MCP requests.

Suggested fix
+def _discover_active_releases_sync() -> list[str]:
+    return ReleaseDiscovery().get_active_releases()
+
 @mcp.tool()
 async def discover_active_releases() -> str:
@@
     try:
-        # Use ReleaseDiscovery class with authenticated GitHub API
-        discovery = ReleaseDiscovery()
-        active_releases = discovery.get_active_releases()
+        loop = asyncio.get_running_loop()
+        active_releases = await loop.run_in_executor(
+            CLI_THREAD_POOL,
+            _discover_active_releases_sync,
+        )
 
         return json.dumps({"releases": active_releases})

Verify that this handler is still bypassing the thread-pool pattern used elsewhere. Expected result: the wrapper helpers show run_in_executor(...), while discover_active_releases() shows a direct ReleaseDiscovery().get_active_releases() call.

#!/bin/bash
set -euo pipefail

printf '%s\n' '--- thread-pool patterns already in this file ---'
rg -n -C3 'run_in_executor|invoke_oar_command_async|invoke_cli_command_async' mcp_server/server.py

printf '\n%s\n' '--- discovery handler ---'
rg -n -C3 'async def discover_active_releases|ReleaseDiscovery\(|get_active_releases\(' mcp_server/server.py
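The executor pattern recommended above can be demonstrated stand-alone; the `time.sleep` stands in for the blocking GitHub I/O, and the names mirror the suggestion rather than the server's actual code:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

CLI_THREAD_POOL = ThreadPoolExecutor(max_workers=4)

def _discover_active_releases_sync():
    time.sleep(0.01)  # stands in for blocking GitHub I/O
    return ["4.20.17", "4.19.31"]

async def discover_active_releases():
    # Offload blocking work so the event loop keeps serving other requests
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(CLI_THREAD_POOL, _discover_active_releases_sync)

releases = asyncio.run(discover_active_releases())
```

While the executor thread sleeps (or waits on GitHub), the event loop is free to schedule other coroutines, which is exactly what a direct synchronous call inside the async handler prevents.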
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcp_server/server.py` around lines 1333 - 1338, The handler is calling
ReleaseDiscovery().get_active_releases() synchronously on the FastMCP event
loop; change it to run the blocking GitHub I/O in the thread pool like the other
wrappers (see invoke_oar_command_async / invoke_cli_command_async) by scheduling
ReleaseDiscovery().get_active_releases() with loop.run_in_executor (or the
existing helper that wraps run_in_executor), and return the JSON result after
awaiting that offloaded call so the event loop is not blocked.
🧹 Nitpick comments (1)
tests/test_release_discovery.py (1)

138-147: Add the rc/ec case that motivated the semver switch.

This test still passes with the old plain 4.x.y ordering, so it doesn't lock in the behavior this PR changed. A small fixture with 4.20.0-ec.1, 4.20.0-rc.1, and 4.20.0 would cover the regression path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_release_discovery.py` around lines 138 - 147, Update the
test_get_latest_release_for_ystream to include prerelease entries in the mocked
GraphQL data via setup_graphql_mock so it contains "4.20.0-ec.1", "4.20.0-rc.1",
and "4.20.0" (in addition to other patch versions), then call
ReleaseDiscovery().get_latest_release_for_ystream("4.20") and assert the
returned latest is "4.20.0" to lock in semver ordering that treats the final
release as greater than rc/ec prereleases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@mcp_server/server.py`:
- Around line 1333-1338: The handler is calling
ReleaseDiscovery().get_active_releases() synchronously on the FastMCP event
loop; change it to run the blocking GitHub I/O in the thread pool like the other
wrappers (see invoke_oar_command_async / invoke_cli_command_async) by scheduling
ReleaseDiscovery().get_active_releases() with loop.run_in_executor (or the
existing helper that wraps run_in_executor), and return the JSON result after
awaiting that offloaded call so the event loop is not blocked.

In `@oar/core/release_discovery.py`:
- Around line 445-461: The code assumes release_data and its "metadata" are
mappings and calls .get, which will raise AttributeError for non-mapping
payloads (e.g., metadata: null); update the logic in the try block to first
verify types (e.g., if not isinstance(release_data, dict) or not
isinstance(release_data.get("metadata", {}), dict)) then log a debug mentioning
the bad payload for that release and return False (skip this release) before
attempting to read release_date_str and parsing it; alternatively, coerce
metadata = release_data.get("metadata") or {} only after type checks so
AttributeError cannot escape.
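The defensive shape of that fix can be sketched as below. The `metadata` / `release_date` field names and the `%Y-%m-%d` date format are assumptions for illustration, not confirmed details of the PR:

```python
from datetime import date, datetime, timedelta

def is_active_release(release_data, max_age_days: int = 90, today=None) -> bool:
    """Return True only for well-formed payloads with a recent release date."""
    # Non-mapping payloads (None, lists, or metadata: null) are skipped, not fatal.
    if not isinstance(release_data, dict):
        return False
    metadata = release_data.get("metadata")
    if not isinstance(metadata, dict):
        return False
    release_date_str = metadata.get("release_date")  # assumed field name
    if not isinstance(release_date_str, str):
        return False
    try:
        release_date = datetime.strptime(release_date_str, "%Y-%m-%d").date()
    except ValueError:
        return False
    today = today or date.today()
    return (today - release_date) <= timedelta(days=max_age_days)
```

Because every malformed shape returns `False`, one bad release entry is skipped instead of raising `AttributeError` out of the batch.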

---

Nitpick comments:
In `@tests/test_release_discovery.py`:
- Around line 138-147: Update the test_get_latest_release_for_ystream to include
prerelease entries in the mocked GraphQL data via setup_graphql_mock so it
contains "4.20.0-ec.1", "4.20.0-rc.1", and "4.20.0" (in addition to other patch
versions), then call ReleaseDiscovery().get_latest_release_for_ystream("4.20")
and assert the returned latest is "4.20.0" to lock in semver ordering that
treats the final release as greater than rc/ec prereleases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f89f64a5-f7a6-4839-8e61-57dcccfa5735

📥 Commits

Reviewing files that changed from the base of the PR and between 84c6138 and de4cbba.

📒 Files selected for processing (4)
  • mcp_server/server.py
  • oar/core/release_discovery.py
  • tests/test_release_discovery.py
  • tools/release_progress_dashboard/release_progress_dashboard.py

- Extract blocking GitHub API calls into _discover_active_releases_sync()
- Use run_in_executor() to avoid blocking FastMCP event loop
- Update docstring: remove ConfigStore references, mention StateBox

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@mcp_server/server.py`:
- Around line 50-53: Runtime metadata still shows old counts after adding the
discovery tool; locate where tool_count and the API category counts are computed
(search for the tool_count variable and the dict key "api" in server.py) and
update the logic to compute counts dynamically from the current tools
registry/list rather than hard-coded values, ensuring the new discovery tool
(discover-active-releases) is included; also update the startup logging block
that emits the total and per-category tool counts so the health endpoint and
startup diagnostics reflect the recalculated values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8e2274cb-083d-4930-abc9-c8f4f179de38

📥 Commits

Reviewing files that changed from the base of the PR and between de4cbba and 08ae165.

📒 Files selected for processing (1)
  • mcp_server/server.py

Comment on lines +50 to +53
- 1 discovery tool (discover-active-releases)
- 3 cache management tools (mcp_cache_stats, mcp_cache_invalidate, mcp_cache_warm)

-Total: 28 tools (100% optimized - all CLI commands use direct Click invocation)
+Total: 29 tools (100% optimized - all CLI commands use direct Click invocation)


⚠️ Potential issue | 🟠 Major

Runtime tool metadata is stale after adding the discovery tool

Docs now state 29 tools, but runtime reporting still returns the old values (tool_count=28 on line 1887, "api": 11 on line 1902, and the startup log counts on lines 1990-1992). This makes health output and startup diagnostics inaccurate.

Suggested fix
-        # Based on server.py: 28 total tools
-        tool_count = 28
+        # Based on server.py: 29 total tools
+        tool_count = 29
@@
-                "api": 11   # 4 config + 4 issue + 3 cache
+                "api": 12   # 4 config + 4 issue + 1 discovery + 3 cache
@@
-    logger.info(f"✓ Total tools: 28 (17 CLI + 11 direct API)")
+    logger.info(f"✓ Total tools: 29 (17 CLI + 12 direct API)")
@@
-    logger.info(f"✓ Direct API tools: 4 config + 4 issue + 3 cache")
+    logger.info(f"✓ Direct API tools: 4 config + 4 issue + 1 discovery + 3 cache")

As per coding guidelines: "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcp_server/server.py` around lines 50 - 53, Runtime metadata still shows old
counts after adding the discovery tool; locate where tool_count and the API
category counts are computed (search for the tool_count variable and the dict
key "api" in server.py) and update the logic to compute counts dynamically from
the current tools registry/list rather than hard-coded values, ensuring the new
discovery tool (discover-active-releases) is included; also update the startup
logging block that emits the total and per-category tool counts so the health
endpoint and startup diagnostics reflect the recalculated values.
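One way to keep those counts from going stale, as the prompt suggests, is to derive them from the tool registry instead of hard-coding them. The registry contents below are hypothetical placeholders, not the server's real tool list:

```python
from collections import Counter

# Hypothetical registry mapping tool name -> category; in the real server this
# would be built from the tools actually registered with FastMCP.
TOOLS = {
    "discover-active-releases": "api",
    "mcp_cache_stats": "api",
    "mcp_cache_invalidate": "api",
    "mcp_cache_warm": "api",
    "example-cli-tool": "cli",
}

tool_count = len(TOOLS)                    # total can never drift from the registry
category_counts = Counter(TOOLS.values())  # per-category counts for health output
```

The health endpoint and startup log would then report `tool_count` and `category_counts` directly, so adding a tool updates both automatically.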

@LuboTerifaj
Contributor Author

Hello @rioliu-rh , thanks for your points above. I updated the PR with refactored code using GraphQL batch fetching and the other changes you suggested in your comments. Could you please have a look? Thanks!
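GraphQL batch fetching of this kind typically aliases one `object(expression:)` field per path so a single query returns many blobs at once. This helper is an illustrative sketch of building such a query, not the PR's actual implementation, and the file paths are examples:

```python
def build_batch_blob_query(owner: str, repo: str, ref: str, paths: list[str]) -> str:
    """Build one GraphQL query fetching the text of several blobs at once."""
    # Each path gets a unique alias so results can be matched back to paths.
    fields = "\n    ".join(
        f'blob{i}: object(expression: "{ref}:{path}") {{ ... on Blob {{ text }} }}'
        for i, path in enumerate(paths)
    )
    return (
        f'query {{\n'
        f'  repository(owner: "{owner}", name: "{repo}") {{\n'
        f'    {fields}\n'
        f'  }}\n'
        f'}}'
    )

query = build_batch_blob_query(
    "openshift", "release-tests", "main",
    ["_releases/4.20/tracking.yaml", "_releases/4.19/tracking.yaml"],
)
```

Compared with one REST call per file, this collapses N round trips into one request, which is the latency win the refactor is after.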


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@mcp_server/server.py`:
- Around line 1320-1321: The startup currently calls ReleaseDiscovery() (used by
discover_active_releases()) which requires GITHUB_TOKEN per
ReleaseDiscovery.__init__, but ConfigStore.validate_environment() and
check_env.sh do not mark GITHUB_TOKEN as required, causing health checks to miss
this failure; fix by adding "GITHUB_TOKEN" to the required environment variables
in ConfigStore.validate_environment() (oar/core/configstore.py) and update
check_env.sh to mark it required, or alternatively add an explicit validation
call for GITHUB_TOKEN in the MCP server startup before instantiating
ReleaseDiscovery() so the failure surfaces during startup rather than at
runtime.

In `@oar/core/release_discovery.py`:
- Around line 300-306: The parse blocks that call yaml.safe_load (e.g., the one
setting tracking_data[y_stream] and the similar StateBox parsing around the
later block) currently raise ReleaseDiscoveryException on yaml.YAMLError; change
both to swallow parsing errors instead: wrap yaml.safe_load in a try/except, on
success assign to tracking_data[y_stream] (or the StateBox container) and
logger.debug as before, but on yaml.YAMLError log a warning/error with the
y_stream/identifier and the exception details and continue (do not raise); this
preserves existing behavior for valid blobs while skipping malformed YAML.
- Around line 81-83: Remove the eager REST call to get_repo() that creates an
unused Repository object: stop assigning self.repo =
self._github.get_repo(self.repo_name) in ReleaseDiscovery.__init__ so the code
only constructs Auth.Token(token) and Github(auth=auth) (self._github) and
defers any repository lookup until/if needed via GraphQL; ensure no other code
expects self.repo to exist (update callers to use self._github when they need to
fetch repo lazily).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a97283ce-439b-4c63-ad01-7b26e01f83fe

📥 Commits

Reviewing files that changed from the base of the PR and between 08ae165 and 12c57cd.

📒 Files selected for processing (2)
  • mcp_server/server.py
  • oar/core/release_discovery.py

Comment on lines +1320 to +1321
discovery = ReleaseDiscovery()
return discovery.get_active_releases()


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '%s\n' '--- validate_environment implementation ---'
rg -n -C8 '\bdef validate_environment\s*\(' --type=py || true

printf '\n%s\n' '--- GITHUB_TOKEN references in Python code ---'
rg -n -C2 '\bGITHUB_TOKEN\b' --type=py || true

printf '\n%s\n' '--- GITHUB_TOKEN references in deployment/runtime files ---'
rg -n -C1 '\bGITHUB_TOKEN\b' -g 'Dockerfile*' -g '*.yml' -g '*.yaml' -g '*.env' -g '*.sh' || true

Repository: openshift/release-tests

Length of output: 10180


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '%s\n' '--- Full validate_environment() implementation ---'
rg -n -A40 'def validate_environment' oar/core/configstore.py --type=py

printf '\n%s\n' '--- MCP server health/startup validation ---'
rg -n -B5 -A10 'def health|def startup|/health' mcp_server/server.py --type=py | head -100

Repository: openshift/release-tests

Length of output: 2677


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '%s\n' '--- Full health_check endpoint implementation ---'
rg -n -A30 '@mcp.custom_route\("/health"' mcp_server/server.py --type=py

printf '\n%s\n' '--- Check what validate_environment validates (continued) ---'
rg -n -A80 'required_vars = {' oar/core/configstore.py --type=py | head -60

printf '\n%s\n' '--- Check mcp_server check_env.sh for required vs optional ---'
cat mcp_server/check_env.sh

Repository: openshift/release-tests

Length of output: 1616


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '%s\n' '--- Full validate_environment validates (required vars section) ---'
sed -n '69,120p' oar/core/configstore.py

printf '\n%s\n' '--- MCP server check_env.sh full content ---'
cat mcp_server/check_env.sh

printf '\n%s\n' '--- Health check implementation (full endpoint) ---'
sed -n '1838,1920p' mcp_server/server.py

Repository: openshift/release-tests

Length of output: 10273


Add GITHUB_TOKEN validation to startup checks.

Line 1320 introduces a hard dependency on GITHUB_TOKEN via ReleaseDiscovery(), but ConfigStore.validate_environment() does not include it in required checks. This creates an operational gap: /health reports green while discover_active_releases() fails on every call.

GITHUB_TOKEN is currently marked optional in check_env.sh (line 62) and missing from the required vars list in validate_environment() (oar/core/configstore.py lines 69–97), even though ReleaseDiscovery.__init__ (oar/core/release_discovery.py line 71–73) explicitly raises an exception if absent.

Move GITHUB_TOKEN to required environment variables or add explicit validation in the MCP server startup path.

Also applies to: 1863–1865, 1985–1988

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcp_server/server.py` around lines 1320 - 1321, The startup currently calls
ReleaseDiscovery() (used by discover_active_releases()) which requires
GITHUB_TOKEN per ReleaseDiscovery.__init__, but
ConfigStore.validate_environment() and check_env.sh do not mark GITHUB_TOKEN as
required, causing health checks to miss this failure; fix by adding
"GITHUB_TOKEN" to the required environment variables in
ConfigStore.validate_environment() (oar/core/configstore.py) and update
check_env.sh to mark it required, or alternatively add an explicit validation
call for GITHUB_TOKEN in the MCP server startup before instantiating
ReleaseDiscovery() so the failure surfaces during startup rather than at
runtime.
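A fail-fast startup check along the lines suggested above could look like the sketch below. The variable list is an assumption for illustration, and the demo token line exists only so the example runs:

```python
import os

REQUIRED_ENV_VARS = ("GITHUB_TOKEN",)  # assumption: extend with your deployment's vars

def validate_required_env(names=REQUIRED_ENV_VARS) -> None:
    """Raise at startup if any required environment variable is unset or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )

os.environ.setdefault("GITHUB_TOKEN", "demo-token")  # demo only, not real validation
validate_required_env()
```

Calling this before the first `ReleaseDiscovery()` instantiation surfaces a missing token during startup and in `/health`, rather than on the first `discover_active_releases()` call.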

Comment on lines +81 to +83
auth = Auth.Token(token)
self._github = Github(auth=auth)
self.repo = self._github.get_repo(self.repo_name)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# 1. Check if self.repo is used anywhere in the file
rg "self\.repo" oar/core/release_discovery.py

Repository: openshift/release-tests

Length of output: 259


🏁 Script executed:

# 2. Check how ReleaseDiscovery is instantiated in mcp_server/server.py
cat -n mcp_server/server.py | head -200

Repository: openshift/release-tests

Length of output: 9356


🏁 Script executed:

# 3. Look at the full constructor and YAML handling in release_discovery.py
wc -l oar/core/release_discovery.py

Repository: openshift/release-tests

Length of output: 100


🏁 Script executed:

# 4. Check lines around 300-306 and 403-409 for YAML error handling
sed -n '75,90p' oar/core/release_discovery.py
sed -n '295,310p' oar/core/release_discovery.py
sed -n '398,415p' oar/core/release_discovery.py

Repository: openshift/release-tests

Length of output: 2568


🏁 Script executed:

# Search for ANY reference to self.repo in the entire release_discovery.py file
rg "self\.repo(?!_)" oar/core/release_discovery.py -n

Repository: openshift/release-tests

Length of output: 295


🏁 Script executed:

# Search for ReleaseDiscovery instantiation in mcp_server/server.py
rg "ReleaseDiscovery" mcp_server/server.py -n -A 5 -B 2

Repository: openshift/release-tests

Length of output: 1354


🏁 Script executed:

# Get the full discover-active-releases tool implementation
sed -n '1,462p' oar/core/release_discovery.py | tail -100

Repository: openshift/release-tests

Length of output: 4020


🏁 Script executed:

# Search for all occurrences of "repo" (case-insensitive) to catch any usage
rg "\.repo" oar/core/release_discovery.py -n

Repository: openshift/release-tests

Length of output: 268


🏁 Script executed:

# Look at the constructor and early method definitions more carefully
sed -n '75,120p' oar/core/release_discovery.py

Repository: openshift/release-tests

Length of output: 1844


🏁 Script executed:

# Check if self.repo might be used in any method
ast-grep --pattern 'def $METHOD($_) { $$$ self.repo $$$ }'

Repository: openshift/release-tests

Length of output: 49


🌐 Web query:

PyGithub Github.get_repo() REST API call behavior

💡 Result:

The PyGithub method Github.get_repo(full_name_or_id, lazy=False) makes a REST API call to the GitHub API to retrieve a Repository object. It calls GET /repos/{owner}/{repo} when provided with a full name string like "owner/repo", or GET /repositories/{id} when provided with a repository ID integer. This is documented in the official PyGithub documentation: "get_repo(full_name_or_id: int | str, lazy: bool = False) → Repository Calls GET /repos/{owner}/{repo} or GET /repositories/{id}" If lazy=True, it creates a partially initialized CompletableGithubObject. Accessing uninitialized attributes triggers an additional request to complete the object with all attributes. By default (lazy=False), it fetches the full repository details immediately upon calling get_repo. Examples throughout the docs show usage like repo = g.get_repo("PyGithub/PyGithub") followed by immediate access to attributes like repo.stargazers_count, repo.get_topics, etc., confirming the initial API call populates the object. The PyGithub library (latest v2.9.0 as of 2026-03-22) wraps the GitHub REST API v3, enabling typed interactions with resources like repositories.

Citations:


🏁 Script executed:

# Verify the ReleaseDiscovery constructor doesn't receive repo_name argument in mcp_server
sed -n '1315,1325p' mcp_server/server.py

Repository: openshift/release-tests

Length of output: 405


🏁 Script executed:

# Check ReleaseDiscovery.__init__ signature to see default values
sed -n '60,85p' oar/core/release_discovery.py

Repository: openshift/release-tests

Length of output: 1122


Remove the unused eager get_repo() call.

Line 83 creates a Repository object that is never used. Since PyGithub.get_repo() performs an immediate REST API call and mcp_server/server.py creates a fresh ReleaseDiscovery() per request (line 1320), every discovery request pays that latency and adds a failure point before the GraphQL path even starts, with no functional benefit.

Suggested change
         auth = Auth.Token(token)
         self._github = Github(auth=auth)
-        self.repo = self._github.get_repo(self.repo_name)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
         auth = Auth.Token(token)
         self._github = Github(auth=auth)
-        self.repo = self._github.get_repo(self.repo_name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@oar/core/release_discovery.py` around lines 81 - 83, Remove the eager REST
call to get_repo() that creates an unused Repository object: stop assigning
self.repo = self._github.get_repo(self.repo_name) in ReleaseDiscovery.__init__
so the code only constructs Auth.Token(token) and Github(auth=auth)
(self._github) and defers any repository lookup until/if needed via GraphQL;
ensure no other code expects self.repo to exist (update callers to use
self._github when they need to fetch repo lazily).

Comment on lines +300 to +306
try:
tracking_data[y_stream] = yaml.safe_load(yaml_text)
logger.debug(f"Parsed tracking file for y-stream {y_stream}")
except yaml.YAMLError as e:
raise ReleaseDiscoveryException(
f"Failed to parse tracking file for {y_stream}: {e}"
) from e


⚠️ Potential issue | 🟠 Major

Skip malformed YAML blobs instead of failing the entire batch.

These two parse blocks still abort discovery on the first bad tracking or StateBox file. One corrupted historical file would make the slash-command/dashboard path return no discoveries at all even though the other releases are still readable.

Suggested change
                             try:
                                 tracking_data[y_stream] = yaml.safe_load(yaml_text)
                                 logger.debug(f"Parsed tracking file for y-stream {y_stream}")
                             except yaml.YAMLError as e:
-                                raise ReleaseDiscoveryException(
-                                    f"Failed to parse tracking file for {y_stream}: {e}"
-                                ) from e
+                                logger.warning(f"Skipping malformed tracking file for {y_stream}: {e}")
@@
                     try:
                         statebox_data[release] = yaml.safe_load(yaml_text)
                         logger.debug(f"Parsed StateBox file for release {release}")
                     except yaml.YAMLError as e:
-                        raise ReleaseDiscoveryException(
-                            f"Failed to parse StateBox file for {release}: {e}"
-                        ) from e
+                        logger.warning(f"Skipping malformed StateBox file for {release}: {e}")

Also applies to: 403-409

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@oar/core/release_discovery.py` around lines 300 - 306, The parse blocks that
call yaml.safe_load (e.g., the one setting tracking_data[y_stream] and the
similar StateBox parsing around the later block) currently raise
ReleaseDiscoveryException on yaml.YAMLError; change both to swallow parsing
errors instead: wrap yaml.safe_load in a try/except, on success assign to
tracking_data[y_stream] (or the StateBox container) and logger.debug as before,
but on yaml.YAMLError log a warning/error with the y_stream/identifier and the
exception details and continue (do not raise); this preserves existing behavior
for valid blobs while skipping malformed YAML.


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
oar/core/release_discovery.py (2)

81-83: ⚠️ Potential issue | 🟠 Major

Drop the unused eager get_repo() lookup.

self.repo is never read in this module, but Github.get_repo() adds a REST call and extra failure point. mcp_server/server.py, Lines 1315-1322 creates a fresh ReleaseDiscovery() per tool invocation, so every slash-command/dashboard request pays that cost before the GraphQL path even starts.

♻️ Proposed fix
         auth = Auth.Token(token)
         self._github = Github(auth=auth)
-        self.repo = self._github.get_repo(self.repo_name)
#!/bin/bash
# Verify that self.repo is only assigned here and a new ReleaseDiscovery
# instance is created for each MCP invocation.
rg -nC2 '\bself\.repo\b' oar/core/release_discovery.py
rg -nC3 'ReleaseDiscovery\(' mcp_server/server.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@oar/core/release_discovery.py` around lines 81 - 83, The current code eagerly
calls Github.get_repo() and assigns to self.repo (auth = Auth.Token(token);
self._github = Github(auth=auth); self.repo =
self._github.get_repo(self.repo_name)), which is unused and causes unnecessary
REST calls; remove the eager lookup by deleting the self.repo assignment and
only keep creating the Github client (self._github = Github(auth=auth)), or
defer get_repo() to where it is actually needed (e.g., callers that use
self._github can invoke self._github.get_repo(self.repo_name) when required);
ensure ReleaseDiscovery construction in mcp_server/server.py still works without
self.repo being present.

298-304: ⚠️ Potential issue | 🟠 Major

Skip malformed YAML blobs instead of failing the whole discovery run.

A single bad tracking or StateBox file still raises ReleaseDiscoveryException. mcp_server/server.py, Lines 1325-1364 converts that into {"releases": []}, so one corrupted blob blanks the slash command/dashboard result even when the other releases are fine.

♻️ Proposed fix
                         if yaml_text:
                             try:
-                                tracking_data[y_stream] = yaml.safe_load(yaml_text)
-                                logger.debug(f"Parsed tracking file for y-stream {y_stream}")
+                                parsed = yaml.safe_load(yaml_text)
                             except yaml.YAMLError as e:
-                                raise ReleaseDiscoveryException(
-                                    f"Failed to parse tracking file for {y_stream}: {e}"
-                                ) from e
+                                logger.warning(f"Skipping malformed tracking file for {y_stream}: {e}")
+                                break
+                            if not isinstance(parsed, dict):
+                                logger.warning(
+                                    f"Skipping tracking file for {y_stream}: expected mapping, got {type(parsed).__name__}"
+                                )
+                                break
+                            tracking_data[y_stream] = parsed
+                            logger.debug(f"Parsed tracking file for y-stream {y_stream}")
@@
                 if blob_object and blob_object.get("text"):
                     yaml_text = blob_object.get("text")
                     try:
-                        statebox_data[release] = yaml.safe_load(yaml_text)
-                        logger.debug(f"Parsed StateBox file for release {release}")
+                        parsed = yaml.safe_load(yaml_text)
                     except yaml.YAMLError as e:
-                        raise ReleaseDiscoveryException(
-                            f"Failed to parse StateBox file for {release}: {e}"
-                        ) from e
+                        logger.warning(f"Skipping malformed StateBox file for {release}: {e}")
+                        continue
+                    if not isinstance(parsed, dict):
+                        logger.warning(
+                            f"Skipping StateBox file for {release}: expected mapping, got {type(parsed).__name__}"
+                        )
+                        continue
+                    statebox_data[release] = parsed
+                    logger.debug(f"Parsed StateBox file for release {release}")

Also applies to: 395-401

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@oar/core/release_discovery.py` around lines 298 - 304, The code currently
raises ReleaseDiscoveryException when yaml.safe_load fails (the try/except
around tracking_data[y_stream] = yaml.safe_load(...) and the analogous block
around StateBox parsing), which aborts the whole discovery run; change both
handlers to catch yaml.YAMLError, log a warning with the y_stream (or StateBox
identifier) and the parsing error, and simply continue (skip that malformed
blob) instead of re-raising the exception so other releases are still processed.
Ensure you update the two catch sites (the block assigning
tracking_data[y_stream] and the similar StateBox parsing block) to use
logger.warning/debug with context and not raise ReleaseDiscoveryException.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@oar/core/release_discovery.py`:
- Around line 81-83: The current code eagerly calls Github.get_repo() and
assigns to self.repo (auth = Auth.Token(token); self._github =
Github(auth=auth); self.repo = self._github.get_repo(self.repo_name)), which is
unused and causes unnecessary REST calls; remove the eager lookup by deleting
the self.repo assignment and only keep creating the Github client (self._github
= Github(auth=auth)), or defer get_repo() to where it is actually needed (e.g.,
callers that use self._github can invoke self._github.get_repo(self.repo_name)
when required); ensure ReleaseDiscovery construction in mcp_server/server.py
still works without self.repo being present.
- Around line 298-304: The code currently raises ReleaseDiscoveryException when
yaml.safe_load fails (the try/except around tracking_data[y_stream] =
yaml.safe_load(...) and the analogous block around StateBox parsing), which
aborts the whole discovery run; change both handlers to catch yaml.YAMLError,
log a warning with the y_stream (or StateBox identifier) and the parsing error,
and simply continue (skip that malformed blob) instead of re-raising the
exception so other releases are still processed. Ensure you update the two catch
sites (the block assigning tracking_data[y_stream] and the similar StateBox
parsing block) to use logger.warning/debug with context and not raise
ReleaseDiscoveryException.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b3e1234d-aff6-4076-8b9e-366821aae6d1

📥 Commits

Reviewing files that changed from the base of the PR and between 12c57cd and 70abd94.

📒 Files selected for processing (1)
  • oar/core/release_discovery.py

@openshift-ci

openshift-ci bot commented Apr 1, 2026

@LuboTerifaj: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
