WINC-1778: [ote] Migrate networking tests to OTE (batch 2/6)#3922
WINC-1778: [ote] Migrate networking tests to OTE (batch 2/6)#3922weinliu wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
@weinliu: This pull request references WINC-1778 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe pull request introduces Windows networking extension tests to the WMCO test suite. A new batch of three extension test specs was registered in the test command module. These specs invoke corresponding extended networking checks that validate Windows pod communication patterns. A new networking check implementation file was created with three exported validation functions: one checking east-west pod-to-pod connectivity between Windows and Linux workloads, one validating external networking with LoadBalancer services, and one performing comprehensive CNI and internal networking validation across multiple connectivity paths including same-node and cross-node scenarios. 🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@weinliu: This pull request references WINC-1778 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: weinliu The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@weinliu: This pull request references WINC-1778 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
ote/test/extended/networking.go (1)
252-254: Minor: Error wrapping with potentially nil error.When
out == ""buterr == nil, wrapping anilerror with%wproduces%!w(<nil>)in the message. This pattern repeats ingetPodIPs,getPodHostIPs, andgetServiceClusterIP. Not a blocker since the message still conveys the failure, but consider explicitly constructing the error without%wwhenerris nil for cleaner diagnostics.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ote/test/extended/networking.go` around lines 252 - 254, The error message is wrapping a possibly nil error with %w (e.g., in the pod lookup code and the functions getPodIPs, getPodHostIPs, getServiceClusterIP), which renders as %!w(<nil>); change the return logic to check if err == nil and construct a plain fmt.Errorf("no pods found for %s in %s", deployment, namespace) (or similar) in that branch, otherwise return fmt.Errorf("no pods found for %s in %s: %w", deployment, namespace, err) to preserve the original error when present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ote/test/extended/networking.go`:
- Around line 414-418: The comment above the pod exec is inaccurate: the code
calls getPodNames(oc, windowsWorkloads, namespace) and executes
oc.AsAdmin().Run("exec", ... pods[0], "--", "curl" ...) against a Windows pod,
not a Linux pod; update the comment to state that the curl is performed from a
Windows pod (or remove the OS-specific note) so it matches getPodNames,
windowsWorkloads and the oc.AsAdmin().Run exec usage.
- Around line 502-514: The DNS names for the services are hardcoded to the wrong
namespace (windowsServiceDNS, linuxServiceDNS) causing failures in
CheckCNIAndInternalNetworking; replace those constants with dynamic DNS strings
built from the local namespace variable (e.g., construct
"win-webserver.%s.svc.cluster.local" and
"linux-webserver.%s.svc.cluster.local:8080" using namespace) and use those new
winDNS/linuxDNS values in the two Output calls so the curl/pwsh requests target
the deployed namespace rather than the hardcoded winc-test.
---
Nitpick comments:
In `@ote/test/extended/networking.go`:
- Around line 252-254: The error message is wrapping a possibly nil error with
%w (e.g., in the pod lookup code and the functions getPodIPs, getPodHostIPs,
getServiceClusterIP), which renders as %!w(<nil>); change the return logic to
check if err == nil and construct a plain fmt.Errorf("no pods found for %s in
%s", deployment, namespace) (or similar) in that branch, otherwise return
fmt.Errorf("no pods found for %s in %s: %w", deployment, namespace, err) to
preserve the original error when present.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: a70ffb78-fded-4a7c-bb65-b6958be80356
📒 Files selected for processing (2)
ote/cmd/wmco-tests-ext/main.goote/test/extended/networking.go
ote/test/extended/networking.go
Outdated
| // Use oc exec from a Linux pod to curl the LB since we may not have direct access | ||
| pods, err := getPodNames(oc, windowsWorkloads, namespace) | ||
| if err == nil && len(pods) > 0 { | ||
| msg, _, _ := oc.AsAdmin().Run("exec", "-n", namespace, pods[0], | ||
| "--", "curl", "--max-time", "10", "http://"+endpoint) |
There was a problem hiding this comment.
Misleading comment: code uses Windows pod, not Linux.
The comment says "Use oc exec from a Linux pod" but pods is fetched from windowsWorkloads - you're curling from the Windows pod. This works fine (Windows can curl its own LB endpoint), but the comment should be corrected to avoid confusion during future maintenance.
📝 Proposed fix
- // Use oc exec from a Linux pod to curl the LB since we may not have direct access
+ // Use oc exec from a Windows pod to curl the LB since we may not have direct access📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Use oc exec from a Linux pod to curl the LB since we may not have direct access | |
| pods, err := getPodNames(oc, windowsWorkloads, namespace) | |
| if err == nil && len(pods) > 0 { | |
| msg, _, _ := oc.AsAdmin().Run("exec", "-n", namespace, pods[0], | |
| "--", "curl", "--max-time", "10", "http://"+endpoint) | |
| // Use oc exec from a Windows pod to curl the LB since we may not have direct access | |
| pods, err := getPodNames(oc, windowsWorkloads, namespace) | |
| if err == nil && len(pods) > 0 { | |
| msg, _, _ := oc.AsAdmin().Run("exec", "-n", namespace, pods[0], | |
| "--", "curl", "--max-time", "10", "http://"+endpoint) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ote/test/extended/networking.go` around lines 414 - 418, The comment above
the pod exec is inaccurate: the code calls getPodNames(oc, windowsWorkloads,
namespace) and executes oc.AsAdmin().Run("exec", ... pods[0], "--", "curl" ...)
against a Windows pod, not a Linux pod; update the comment to state that the
curl is performed from a Windows pod (or remove the OS-specific note) so it
matches getPodNames, windowsWorkloads and the oc.AsAdmin().Run exec usage.
ote/test/extended/networking.go
Outdated
| // Linux → Windows via DNS | ||
| msg, err = oc.AsAdmin().Output("exec", "-n", namespace, linuxPods[0], | ||
| "--", "curl", windowsServiceDNS) | ||
| if err != nil || !strings.Contains(msg, "Windows Container Web Server") { | ||
| return fmt.Errorf("Linux pod cannot reach Windows pod via DNS: %s", msg) | ||
| } | ||
|
|
||
| // Windows → Linux via DNS | ||
| msg, err = oc.AsAdmin().Output("exec", "-n", namespace, winPods[0], | ||
| "--", "pwsh.exe", "-Command", psInvokeWebRequest("http://"+linuxServiceDNS)) | ||
| if err != nil || !strings.Contains(msg, "Linux Container Web Server") { | ||
| return fmt.Errorf("Windows pod cannot reach Linux pod via DNS: %s", msg) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify if the constants are used anywhere else that might justify keeping them
rg -n "windowsServiceDNS|linuxServiceDNS" --type goRepository: openshift/windows-machine-config-operator
Length of output: 477
🏁 Script executed:
# Get the function containing lines 502-514 and see the namespace being used
sed -n '428,527p' ote/test/extended/networking.go | cat -nRepository: openshift/windows-machine-config-operator
Length of output: 4535
Critical: DNS hardcoding to wrong namespace breaks OCP-31276 test.
The constants windowsServiceDNS and linuxServiceDNS are hardcoded to winc-test, but CheckCNIAndInternalNetworking deploys workloads in winc-31276. DNS queries fail because win-webserver.winc-test.svc.cluster.local doesn't exist—the service is at win-webserver.winc-31276.svc.cluster.local.
Construct DNS names dynamically using the namespace variable:
winDNS := fmt.Sprintf("win-webserver.%s.svc.cluster.local", namespace)
linuxDNS := fmt.Sprintf("linux-webserver.%s.svc.cluster.local:8080", namespace)These constants are only used here (no other callers), so removing them is safe.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ote/test/extended/networking.go` around lines 502 - 514, The DNS names for
the services are hardcoded to the wrong namespace (windowsServiceDNS,
linuxServiceDNS) causing failures in CheckCNIAndInternalNetworking; replace
those constants with dynamic DNS strings built from the local namespace variable
(e.g., construct "win-webserver.%s.svc.cluster.local" and
"linux-webserver.%s.svc.cluster.local:8080" using namespace) and use those new
winDNS/linuxDNS values in the two Output calls so the curl/pwsh requests target
the deployed namespace rather than the hardcoded winc-test.
|
/retest lint |
|
/retest security |
|
@weinliu: The Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@weinliu: The Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Migrate the following networking smokerun tests from openshift-tests-private to the OTE extension binary: - OCP-28632: Windows and Linux east-west network (uses pre-existing pods) - OCP-32273: Configure kube proxy and external networking (LoadBalancer) - OCP-31276: Configure CNI and internal networking (pod-to-pod + DNS) OCP-39451 (ClusterIP with MachineSet scaling) is excluded from this batch due to its long runtime and disruptive nature ([Slow][Disruptive]).
8ffea47 to
f6ef066
Compare
|
@weinliu: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@weinliu: This pull request references WINC-1778 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Closing, please open a separate request if this work is still required. |
Summary
Migrates the Networking batch of WINC tests as part of WINC-1778.
Batch 2 — Networking (3 tests):
Note: OCP-39451 (ClusterIP with MachineSet scaling) is excluded from this batch due to its long runtime and disruptive nature.
Planned migration batches (WINC-1778):
Notes
win-webserverandlinux-webserverpods in thewinc-testnamespace (provisioned by Flexy/cluster setup)Summary by CodeRabbit