Skip to content

fix(api-core): repair a silently-passing state assertion and the ignored rack tests#2716

Merged
chet merged 1 commit into
NVIDIA:mainfrom
chet:gh-issue-2694
Jun 21, 2026
Merged

fix(api-core): repair a silently-passing state assertion and the ignored rack tests#2716
chet merged 1 commit into
NVIDIA:mainfrom
chet:gh-issue-2694

Conversation

@chet

@chet chet commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Wave 1 of the carbide-test-support integration (#2692).

Two real test-coverage holes in api-core:

  • machine_states::test_state_outcome asserted matches!(host_machine.current_state(), _expected_state) -- an irrefutable binding, so it passed for any state and tested nothing. Now asserts equality against the expected_state value it builds; stray dbg! removed.
  • Four rack state-controller tests were bare #[ignore] (never ran in CI). test_expected_zero_topology_transitions_to_discovering was only broken by a duplicate db_rack::create in its setup (the second call, mislabeled "simulating linked trays," just re-created the rack) -- fixed and re-enabled.

The other three formerly-ignored tests assert Created/Discovering transitions the handler does not currently make. Rather than leave them silently dead, each now carries an #[ignore = "..."] reason and is tracked in #2715 (handler-fix vs stale-test, per test). This converts 4 invisible dead tests into 1 live test + 3 documented-and-tracked ones.

Verified: rack-handler + machine-states tests run green (44 pass, 3 explicitly ignored, 0 fail).

Addresses #2694. Surfaced follow-up: #2715.

…red rack tests

`test_state_outcome` asserted `matches!(host_machine.current_state(),
_expected_state)`. In pattern position `_expected_state` is an irrefutable binding,
not a comparison against the value built just above it -- so the assertion passed
for any state and tested nothing. It now asserts equality against that value, and
a stray `dbg!` is removed.

The four rack state-controller tests were bare `#[ignore]`, so they never ran in
CI. `test_expected_zero_topology_transitions_to_discovering` was only broken by a
duplicate `db_rack::create` in its setup -- the second call, mislabeled as
"simulating linked trays," just re-created the same rack -- so with that removed
it passes and is re-enabled. The other three assert `Created`/`Discovering`
transitions the handler does not currently make; rather than leave them silently
dead, each now carries an `#[ignore = "..."]` reason and is tracked in NVIDIA#2715 to
decide handler-fix versus stale-test per case.

The rack-handler and machine-states tests run green: 44 pass, 3 explicitly
ignored (down from 4 silently ignored), 0 fail.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
@chet

chet commented Jun 20, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai PTAL, thanks!

@copy-pr-bot

copy-pr-bot Bot commented Jun 20, 2026

Copy link
Copy Markdown

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 564b309c-3248-4d8a-b38f-30f78c86e5a0

📥 Commits

Reviewing files that changed from the base of the PR and between 79e54c0 and ae98652.

📒 Files selected for processing (2)
  • crates/api-core/src/tests/machine_states.rs
  • crates/api-core/src/tests/rack_state_controller/handler.rs

Summary by CodeRabbit

  • Tests
    • Improved test assertions and removed unnecessary debug logging for better test clarity.
    • Re-enabled a previously inactive test and updated test documentation with explicit ignore reasons.

Walkthrough

Two test files are updated: machine_states.rs replaces a weak matches! assertion with a strict assert_eq! and removes a dbg! debug call in test_state_outcome; rack_state_controller/handler.rs adds explicit issue-linked #[ignore] reasons to three tests and re-enables one previously ignored test.

Changes

Test Assertion Hardening and Ignore Annotation Cleanup

Layer / File(s) Summary
Harden test_state_outcome assertion and remove dbg!
crates/api-core/src/tests/machine_states.rs
Replaces the always-passing matches!(…, _expected_state) assertion with a proper assert_eq!(host_machine.current_state(), &expected_state, …) with an explicit failure message; removes the stray dbg!(&outcome) debug print.
Rack state controller ignore annotation updates
crates/api-core/src/tests/rack_state_controller/handler.rs
Adds #[ignore = "…#2715"] with explicit reasons to three tests (test_expected_incomplete_device_counts_stays, test_expected_more_discovered_than_expected_transitions, test_discovering_waits_for_compute_ready); re-enables test_expected_zero_topology_transitions_to_discovering by removing its #[ignore] attribute; updates an inline comment to reference the "Empty" rack profile.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the primary changes: fixing a defective assertion in test_state_outcome and remediating four previously-ignored rack state-controller tests.
Description check ✅ Passed The description provides comprehensive context for both fixes, clearly explaining the assertion defect, the duplicate db_rack::create issue, and the strategy for handling the other three tests.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Fetch PR details
gh pr view 2716 --repo NVIDIA/infra-controller --json title,body,files,additions,deletions

echo "---FILES CHANGED---"
gh pr diff 2716 --repo NVIDIA/infra-controller

Length of output: 6387



@chet: The changes are clean, well-scoped, and address two legitimate test-infrastructure defects. Here is my assessment:


crates/api-core/src/tests/machine_states.rs

Critical fix — irrefutable pattern. matches!(host_machine.current_state(), _expected_state) used an underscore-prefixed binding, which is an irrefutable pattern; the assertion was equivalent to true regardless of the machine's actual state. Replacing it with:

assert_eq!(
    host_machine.current_state(),
    &expected_state,
    "machine should be in DPUInit, waiting for network config"
);

is precisely correct. The reference (&expected_state) is consistent with current_state() returning a reference, and the diagnostic message provides immediate actionable context on failure.

Removal of the dbg! is appropriate.


crates/api-core/src/tests/rack_state_controller/handler.rs

test_expected_zero_topology_transitions_to_discovering (re-enabled). The duplicate db_rack::create on the same rack_id was the sole defect. Removing the first call (mislabeled "2 compute" for what is explicitly a zero-topology scenario) and retaining the correct RackProfileId::new("Empty") + RackConfig::default() call — with a clarified comment — is a minimal, correct fix. No behavioral change to the test logic itself.

Three remaining ignored tests. Promoting bare #[ignore] to #[ignore = "asserts rack state-transition behavior the handler does not currently exhibit; tracked in #2715"] is the right engineering trade-off: the tests document an unmet contract, they are visible in cargo test -- --list, and #2715 provides the tracking path for resolution. This is materially better than silent dead tests.


Summary: No concerns. The fixes are precise, the test documentation is production-grade, and the scope is appropriately narrow. LGTM. ✅

[approve]

@chet chet marked this pull request as ready for review June 21, 2026 04:17
@chet chet requested a review from a team as a code owner June 21, 2026 04:17
@github-actions

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
boot-artifacts-aarch64 3 0 0 3 0 0
boot-artifacts-x86_64 3 0 0 3 0 0
forge-admin-cli-x86_64 264 6 23 99 6 130
machine-validation-runner 704 34 183 258 35 194
machine_validation 704 34 183 258 35 194
nvmetal-carbide 704 34 183 258 35 194
TOTAL 2382 108 572 879 111 712

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@chet chet merged commit 7af2aac into NVIDIA:main Jun 21, 2026
55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants