Skip to content

Fixes to backend tests#2778

Open
Amoghrd wants to merge 1 commit into
netobserv:mainfrom
Amoghrd:fixes
Open

Fixes to backend tests#2778
Amoghrd wants to merge 1 commit into
netobserv:mainfrom
Amoghrd:fixes

Conversation

@Amoghrd
Copy link
Copy Markdown
Member

@Amoghrd Amoghrd commented May 20, 2026

Description

PR changes:

  • Update IDMS due to new PF5 builds
  • NETOBSERV-2680: Fix Kafka tests by updating API to v1
  • NETOBSERV-2681: Fix flowcollectorSlice flake by adding continous ping
  • Update TLS test to add metrics test and also rename TLSCurve to TLSGroup

Dependencies

n/a

Test Results

[sig-netobserv] Network_Observability with Loki with Kafka Author:aramesha-NonPreRelease-Longduration-Critical-56362-High-53597-High-56326-High-64880-High-75340-Verify network flows are captured with Kafka
with TLS [Serial][Slow]
• PASSED [783.614 seconds]

[sig-netobserv] Network_Observability with Loki with Kafka Author:aramesha-NonPreRelease-Longduration-High-57397-High-65116-Verify network-flows export with Kafka and netobserv installation without
Loki[Serial]
• PASSED [869.841 seconds]

[sig-netobserv] Network_Observability Author:aramesha-Critical-86388-Verify flowCollectorSlice collectionMode: AlwaysCollect [Serial]
• PASSED [663.264 seconds]

[sig-netobserv] Network_Observability Author:aramesha-Critical-86388-Verify flowCollectorSlice collectionMode: AllowList [Serial]
• PASSED [1047.535 seconds]

[sig-netobserv] Network_Observability with Loki Author:aramesha-NonPreRelease-High-88455-Verify TLS Tracking feature [Serial]
• PASSED [821.818 seconds]

Checklist

  • Does the changes in PR need specific configuration or environment set up for testing?
    • if so please describe it in PR description.
  • I have added thorough unit tests for the change.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Summary by CodeRabbit

  • New Features

    • Improved TLS flow tracking: flow logs and metrics now include TLS group information.
  • Tests

    • Enhanced TLS-related test validations and metrics checks; adjusted test timings and cleanup.
    • Minor test ordering and comment clarifications.
  • Chores

    • Upgraded Strimzi Kafka templates to API v1 and bumped broker version to 4.1.0.
    • Updated console plugin image mirror registry entries.

Review Change Stack

@Amoghrd Amoghrd requested a review from oliver-smakal May 20, 2026 17:53
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 20, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mffiedler for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

This PR updates integration tests to migrate TLS tracking from TLSCurve to TLSGroup (including a new TLS metrics verifier and test assertion changes), upgrades Strimzi Kafka templates to API v1 and broker version 4.1.0, changes test traffic to sustained 300s pings, and adds image mirror entries.

Changes

Integration Test Updates

Layer / File(s) Summary
TLS field and metrics tracking migration
integration-tests/backend/flowcollector.go, integration-tests/backend/metrics.go, integration-tests/backend/test_flowcollector.go
Flowlog replaces TLSCurve with TLSGroup. Added verifyTLSMetrics to query netobserv_namespace_tls_flows_total. TLS test assertions updated to assert TLSGroup is non-empty for TLS 1.3 flows; removed TLSCurve checks and adjusted related comments and cleanup.
Test traffic generation for cross-namespace flows
integration-tests/backend/test_flowcollectorslice.go
Scenario 4 traffic generator now uses timeout 300 ping <serverPodIP> instead of ping -c 100, and the post-ping wait before Loki verification reduced to 60s.
Strimzi Kafka API and version upgrades
integration-tests/backend/testdata/kafka/kafka-*.yaml, integration-tests/backend/test_flowcollector.go
Kafka/OpenShift template manifests updated from kafka.strimzi.io/v1beta2 to kafka.strimzi.io/v1. Kafka broker version bumped from 4.0.0 to 4.1.0 where present. Test deploy order adjusted to create KafkaNodePool before Kafka.
Image mirror configuration updates
integration-tests/backend/testdata/subscription/image-digest-mirror-set.yaml
Added network-observability-console-plugin-pf5 mirror entries and extended network-observability-console-plugin-pf4 to include pf4-zstream and switch source to the pf4-rhel9 variant.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive Title is vague and generic, using 'Fixes to backend tests' without specifying the main changes. Consider a more descriptive title that highlights the primary fixes, such as 'Update Kafka API version, TLS tracking, and test flakes' or similar.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed Description includes key context (ticket references, test results) but does not fully complete the template's description section and required information.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Amoghrd Amoghrd requested a review from memodi May 20, 2026 17:53
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@integration-tests/backend/test_flowcollector.go`:
- Line 2730: The test currently hard-codes an expectation that
r.Flowlog.TLSGroup contains "X25519", which is environment-dependent and flaky;
change the assertion in integration-tests/backend/test_flowcollector.go that
references r.Flowlog.TLSGroup (the
Expect(...).Should(ContainSubstring("X25519")) line) to instead assert a
non-empty value or membership in a small allowed set (e.g., check that
r.Flowlog.TLSGroup != "" or that it is one of ["X25519","secp256r1","secp384r1"]
using Expect(...).ShouldNot(o.BeEmpty()) or
Expect(allowedSet).To(ContainElement(r.Flowlog.TLSGroup))). Ensure you update
only that assertion so tests become environment-agnostic.

In `@integration-tests/backend/test_flowcollectorslice.go`:
- Around line 405-407: The test treats the `timeout 300 ping ...` exit status as
a failure (calling e2eoutput.RunHostCmd and then
Expect(err).NotTo(o.HaveOccurred())), but timeout intentionally exits non‑zero
when it stops pinging; update the RunHostCmd invocation to tolerate the non‑zero
exit by appending "|| true" (e.g., run "bash -c 'timeout 300 ping <ip> ||
true'") or otherwise ignore the timeout-induced exit code, and keep the Expect
check only for real execution errors rather than the timeout exit.

In `@integration-tests/backend/testdata/kafka/kafka-default.yaml`:
- Around line 48-49: The file sets metadataVersion: 4.0-IV3 while version:
4.1.0, which causes UnsupportedVersionException; update the metadataVersion to
match Kafka 4.1.x or remove the metadataVersion field so Strimzi will default to
the correct metadata for the declared cluster version (change the
metadataVersion entry or delete it near the existing metadataVersion and version
keys).

In `@integration-tests/backend/testdata/kafka/kafka-tls.yaml`:
- Around line 52-53: The metadataVersion currently set to "metadataVersion:
4.0-IV3" is incompatible with Kafka "version: 4.1.0"; update the metadataVersion
field in the kafka-tls.yaml (the metadataVersion key) to a 4.1 level such as
"4.1-IV1" or remove the metadataVersion line to let Strimzi choose the correct
default so inter-broker communication uses the proper metadata protocol for
Kafka 4.1.0.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 97f05324-c3d8-42bc-a9fe-907953176d71

📥 Commits

Reviewing files that changed from the base of the PR and between 52e076d and bf7a6a2.

📒 Files selected for processing (10)
  • integration-tests/backend/flowcollector.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/test_flowcollector.go
  • integration-tests/backend/test_flowcollectorslice.go
  • integration-tests/backend/testdata/kafka/kafka-default.yaml
  • integration-tests/backend/testdata/kafka/kafka-node-pool.yaml
  • integration-tests/backend/testdata/kafka/kafka-tls.yaml
  • integration-tests/backend/testdata/kafka/kafka-topic.yaml
  • integration-tests/backend/testdata/kafka/kafka-user.yaml
  • integration-tests/backend/testdata/subscription/image-digest-mirror-set.yaml

Comment thread integration-tests/backend/test_flowcollector.go Outdated
Comment thread integration-tests/backend/test_flowcollectorslice.go Outdated
Comment thread integration-tests/backend/testdata/kafka/kafka-default.yaml Outdated
Comment thread integration-tests/backend/testdata/kafka/kafka-tls.yaml Outdated
@Amoghrd Amoghrd added the needs-review Tells that the PR needs a review label May 20, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
integration-tests/backend/test_flowcollectorslice.go (1)

405-407: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don’t suppress traffic-generation execution errors.

Line 406 currently drops all command failures. That can hide real exec/ping failures and make this scenario pass without guaranteed traffic.

Suggested fix
-		// Continuously ping server for 300 seconds to generate sustained traffic
-		_, _ = e2eoutput.RunHostCmd(testPingPodsTemplate.ClientNS, "ping-client", "timeout 300 ping "+serverPodIP)
+		// Continuously ping server for 300 seconds to generate sustained traffic
+		_, err = e2eoutput.RunHostCmd(testPingPodsTemplate.ClientNS, "ping-client", "ping -w 300 "+serverPodIP)
+		o.Expect(err).NotTo(o.HaveOccurred())
 		time.Sleep(60 * time.Second)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/test_flowcollectorslice.go` around lines 405 - 407,
Do not discard the RunHostCmd result: replace the ignored return with capturing
both output and error from e2eoutput.RunHostCmd(testPingPodsTemplate.ClientNS,
"ping-client", "timeout 300 ping "+serverPodIP) and fail the test on error
(e.g., t.Fatalf or equivalent) while including both the error and command output
in the message so real exec/ping failures are visible; keep the subsequent
time.Sleep(60 * time.Second).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@integration-tests/backend/test_flowcollectorslice.go`:
- Around line 405-407: Do not discard the RunHostCmd result: replace the ignored
return with capturing both output and error from
e2eoutput.RunHostCmd(testPingPodsTemplate.ClientNS, "ping-client", "timeout 300
ping "+serverPodIP) and fail the test on error (e.g., t.Fatalf or equivalent)
while including both the error and command output in the message so real
exec/ping failures are visible; keep the subsequent time.Sleep(60 *
time.Second).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8dd3af5f-df7b-4d51-9fe4-cadc9cf16caa

📥 Commits

Reviewing files that changed from the base of the PR and between bf7a6a2 and 5d17135.

📒 Files selected for processing (10)
  • integration-tests/backend/flowcollector.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/test_flowcollector.go
  • integration-tests/backend/test_flowcollectorslice.go
  • integration-tests/backend/testdata/kafka/kafka-default.yaml
  • integration-tests/backend/testdata/kafka/kafka-node-pool.yaml
  • integration-tests/backend/testdata/kafka/kafka-tls.yaml
  • integration-tests/backend/testdata/kafka/kafka-topic.yaml
  • integration-tests/backend/testdata/kafka/kafka-user.yaml
  • integration-tests/backend/testdata/subscription/image-digest-mirror-set.yaml
✅ Files skipped from review due to trivial changes (2)
  • integration-tests/backend/testdata/kafka/kafka-node-pool.yaml
  • integration-tests/backend/testdata/kafka/kafka-user.yaml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-review Tells that the PR needs a review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant