
NETOBSERV-2660 Add server performance benchmark improvements#1444

Open
Amoghrd wants to merge 1 commit into netobserv:main from Amoghrd:server-perf-followup

Conversation

@Amoghrd
Member

@Amoghrd Amoghrd commented Apr 20, 2026

Description

Implement future improvements listed in BENCHMARKS.md:

  • Add BenchmarkExport for CSV export flows endpoint
  • Add BenchmarkLargeResultSets to test 100, 1K, 5K, 10K records
  • Add BenchmarkFilterHeavy for complex filter combinations
  • Add BenchmarkConcurrentTableView and BenchmarkConcurrentTopologyMetrics for parallel load testing
  • Update BENCHMARKS.md documentation with new scenarios and coverage

All benchmarks run successfully individually or in small groups. The concurrent benchmarks are separated to avoid macOS port exhaustion when running the full suite.
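
As an illustration of the shape a benchmark like BenchmarkLargeResultSets can take, here is a minimal, self-contained sketch: it assumes an httptest mock backend and an illustrative `/api/flow/records` endpoint; the names, payloads, and record counts are not taken from this PR's code.

```go
// Illustrative sketch only: the real benchmarks live in pkg/server; the endpoint
// path, payload shape, and record counts below are assumptions for the example.
package server_test

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"testing"
)

func BenchmarkLargeResultSetsSketch(b *testing.B) {
	// Mock backend whose response size scales with the requested limit,
	// standing in for the real datastore.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		n, _ := strconv.Atoi(r.URL.Query().Get("limit"))
		_, _ = w.Write([]byte(strings.Repeat(`{"flow":1}`, n)))
	}))
	defer backend.Close()

	for _, size := range []int{100, 1000, 5000, 10000} {
		b.Run(fmt.Sprintf("records_%d", size), func(b *testing.B) {
			url := fmt.Sprintf("%s/api/flow/records?limit=%d", backend.URL, size)
			for i := 0; i < b.N; i++ {
				resp, err := http.Get(url)
				if err != nil {
					b.Fatalf("request to %s failed: %v", url, err)
				}
				_, _ = io.Copy(io.Discard, resp.Body)
				_ = resp.Body.Close()
			}
		})
	}
}
```

With this shape, a single size can be selected through go test's -bench regex, for example -bench 'LargeResultSets/records_10000'.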

Dependencies

n/a

Checklist

  • Do the changes in this PR need specific configuration or environment setup for testing?
    • If so, please describe it in the PR description.
  • I have added thorough unit tests for the change.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer confidence, or per agreement with the QE team).

Summary by CodeRabbit

  • Tests

    • Added comprehensive benchmark tests for server performance across various scenarios (overview, table, topology, export)
    • Included benchmarks for filter-heavy queries, large result sets, concurrent load, and aggregation variations
  • Documentation

    • Updated performance testing documentation with explicit benchmark targets and scenario descriptions
  • Chores

    • Added convenience Make targets for running specific benchmark suites


@openshift-ci

openshift-ci Bot commented Apr 20, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign stleerh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Amoghrd Amoghrd added the no-qe (This PR doesn't necessitate QE approval) and no-doc (This PR doesn't require documentation change on the NetObserv operator) labels on Apr 20, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Apr 20, 2026

🚥 Pre-merge checks: ✅ 5 passed

  • Description check: ✅ Passed. Description adequately covers the changes, dependencies, and QE requirements. The checklist is properly filled with 'No QE' selected and a rationale provided for the improvements implemented.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Title check: ✅ Passed. The title clearly summarizes the main change: adding server performance benchmark improvements, which aligns with the PR's objective of implementing benchmark enhancements across multiple test files.



@Amoghrd Amoghrd force-pushed the server-perf-followup branch 3 times, most recently from 8152d50 to 324e90a Compare April 20, 2026 20:29
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/server/server_perf_test.go (1)

21-27: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle json.Marshal failures in benchmark setup.

Line 21 and Line 46 ignore marshal errors. If fixture types change, setup can silently break and skew benchmark behavior.

Suggested fix
-	matrixResponse, _ := json.Marshal(model.QueryResponse{
+	matrixResponse, err := json.Marshal(model.QueryResponse{
 		Status: "",
 		Data: model.QueryResponseData{
 			ResultType: model.ResultTypeMatrix,
 			Result:     model.Matrix{},
 		},
 	})
+	if err != nil {
+		panic("failed to marshal matrix benchmark response: " + err.Error())
+	}
...
-	streamResponse, _ := json.Marshal(model.QueryResponse{
+	streamResponse, err := json.Marshal(model.QueryResponse{
 		Status: "",
 		Data: model.QueryResponseData{
 			ResultType: model.ResultTypeStream,
 			Result:     streams,
 		},
 	})
+	if err != nil {
+		panic("failed to marshal stream benchmark response: " + err.Error())
+	}

As per coding guidelines, **/*.{ts,tsx,go}: Handle errors with proper error messages and consistent error handling patterns.

Also applies to: 46-52

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/server/server_perf_test.go` around lines 21 - 27, The json.Marshal calls
that create matrixResponse (and the similar streamResponse block) currently
ignore errors; update both marshals to capture the error and fail fast with a
clear message (e.g., if err != nil { log.Fatalf("marshal %s response: %v",
"matrix"/"stream", err) } or panic/ t.Fatalf in test context) so setup doesn't
silently proceed when fixture types change; modify the code around the
matrixResponse and the streamResponse marshal sites to check err and include the
actual error in the fatal message.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/server/server_perf_overview_test.go`:
- Around line 121-122: The test currently ignores errors returned by
http.NewRequest (e.g., the line creating req from
backendSvc.URL+"/api/flow/metrics?"+params); update both occurrences to capture
and handle the error (err := http.NewRequest(...)), and fail the test
deterministically (e.g., t.Fatalf or require.NoError) with a clear message if
err != nil before calling client.Do(req); ensure the error handling uses the
same pattern both where req is created and where the second request is built so
failures are explicit and triageable.

In `@pkg/server/server_perf_test.go`:
- Around line 143-145: The code currently ignores errors returned by
http.NewRequest in runFlowRecordsQuery (and the other benchmark helpers around
the same file), so update each call to check the error, log or fail the
test/benchmark immediately with clear endpoint context (include the URL/query
string and function name) and return/stop the benchmark—e.g., in
runFlowRecordsQuery and the other helper functions that call http.NewRequest,
replace the discarded error with proper error handling that fails fast (t.Fatalf
or b.Fatalf or equivalent) and includes the endpoint being requested for
actionable diagnostics.

In `@pkg/server/server_perf_topology_test.go`:
- Around line 64-65: The test currently ignores errors returned by
http.NewRequest; update both occurrences where req, _ := http.NewRequest(...) is
used to check the error and fail immediately with context (e.g. if err != nil {
b.Fatalf("creating request to %s failed: %v",
backendSvc.URL+"/api/flow/metrics?"+params, err) } or use t.Fatalf if inside a
unit test), so replace the blank identifier with err, validate err, and include
the request URL/params and the error in the Fatalf message; apply the same
change for the second occurrence around the client.Do call as well.

---

Outside diff comments:
In `@pkg/server/server_perf_test.go`:
- Around line 21-27: The json.Marshal calls that create matrixResponse (and the
similar streamResponse block) currently ignore errors; update both marshals to
capture the error and fail fast with a clear message (e.g., if err != nil {
log.Fatalf("marshal %s response: %v", "matrix"/"stream", err) } or panic/
t.Fatalf in test context) so setup doesn't silently proceed when fixture types
change; modify the code around the matrixResponse and the streamResponse marshal
sites to check err and include the actual error in the fatal message.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2e51b39e-0e5d-48b2-bc83-8c6677959a6a

📥 Commits

Reviewing files that changed from the base of the PR and between 8cfc3b1 and 31d022e.

📒 Files selected for processing (6)
  • Makefile
  • pkg/server/BENCHMARKS.md
  • pkg/server/server_perf_overview_test.go
  • pkg/server/server_perf_table_test.go
  • pkg/server/server_perf_test.go
  • pkg/server/server_perf_topology_test.go

Adds comprehensive Go benchmarks for server performance testing across
all major views (Overview, Topology, Table) and scenarios.

## New Benchmark Files

- server_perf_overview_test.go - Overview page scenarios (Basic, DNS, RTT, Dropped, Full)
- server_perf_topology_test.go - Topology view with all scopes (cluster, zone, host, namespace, owner, resource)
- server_perf_table_test.go - Table view with varying result sizes (100, 1K, 10K records)
- server_perf_test.go - Shared helpers and export benchmarks

## Test Coverage

**Overview Tab:**
- Basic metrics (namespace/app aggregation for Bytes/Packets)
- DNS latency scenarios (avg, p90, by name/code/app)
- RTT scenarios (min, avg, p90)
- Packet drop scenarios (by state/cause)
- Full scenario (all features enabled - worst case)
- Filter complexity (1, 2, 4 filters)
- Concurrent user load

**Topology Tab:**
- All metric types (Bytes, Packets, DNS, RTT, Dropped)
- BytesWithDrops combined scenario
- All 6 scopes (cluster, zone, host, namespace, owner, resource)
- Filter complexity (1, 2, 4 filters)
- Concurrent user load

**Table Tab:**
- Basic table view
- With histogram (2 queries)
- Large result sets (100, 1K, 10K records)
- Filter complexity (1, 2, 4 filters)
- Concurrent user load (see the sketch after the Export list)

**Export:**
- Basic CSV export
- With filters
- With limit
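
The concurrent-user-load scenarios listed above follow the standard b.RunParallel pattern. A minimal sketch is shown below; the endpoint, payload, and parallelism factor are assumptions for illustration, not values from this PR.

```go
// Sketch of a concurrent-load benchmark: multiple goroutines hit the endpoint via
// b.RunParallel. Errors are reported with b.Errorf because b.Fatalf must not be
// called from goroutines spawned by RunParallel.
package server_test

import (
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkConcurrentTableViewSketch(b *testing.B) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		_, _ = w.Write([]byte(`{"resultType":"streams","result":[]}`))
	}))
	defer backend.Close()

	client := &http.Client{}
	b.SetParallelism(4) // roughly 4*GOMAXPROCS simulated users
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			resp, err := client.Get(backend.URL + "/api/flow/records?limit=100")
			if err != nil {
				b.Errorf("concurrent request failed: %v", err)
				return
			}
			_, _ = io.Copy(io.Discard, resp.Body)
			_ = resp.Body.Close()
		}
	})
}
```

Because each parallel client keeps its own connections open, benchmarks of this kind are the ones the PR description separates from the full suite to avoid macOS ephemeral-port exhaustion.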

## Code Quality Improvements

- Constants for HTTP client connection pool values
- Named filterTest type for clarity
- Comprehensive error handling for json.Marshal and http.NewRequest
- Defensive nil checks for response body cleanup
- Shared helper functions (cleanupBenchmarkServers, runMetricsQuery, etc.)
- DRY filter test data via getCommonFilterTests()
- Optimized timestamp generation (reduced 10K calls to 1 per benchmark; sketched below)
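
Two of these items sketched in isolation; the constant values and record shape are assumptions, only the pattern mirrors the list above.

```go
// Sketch: named constants for the HTTP client connection pool, and one timestamp
// computed per benchmark instead of one per generated record.
package server_test

import (
	"net/http"
	"strconv"
	"strings"
	"testing"
	"time"
)

const (
	benchMaxIdleConns        = 100 // illustrative pool sizes, not the PR's values
	benchMaxIdleConnsPerHost = 100
)

func newBenchmarkClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			MaxIdleConns:        benchMaxIdleConns,
			MaxIdleConnsPerHost: benchMaxIdleConnsPerHost,
		},
	}
}

func buildRecords(b *testing.B, n int) string {
	b.Helper()
	// One time.Now() call for the whole batch, reused by every record.
	ts := strconv.FormatInt(time.Now().Unix(), 10)
	var sb strings.Builder
	for i := 0; i < n; i++ {
		sb.WriteString(`{"timestamp":` + ts + `},`)
	}
	return sb.String()
}
```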

## Baseline File

Updated benchmark-baseline.txt with CI results for regression detection.
All benchmarks were run on an AMD EPYC 9V74 80-core processor to establish the baseline.

## Documentation

Added pkg/server/BENCHMARKS.md with:
- Quick start guide
- Detailed benchmark descriptions
- Performance baseline interpretation
- CI integration instructions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@Amoghrd Amoghrd force-pushed the server-perf-followup branch from c0728a7 to af74df8 Compare May 11, 2026 20:59
@Amoghrd Amoghrd changed the title from "WIP: NETOBSERV-2660 Add server performance benchmark improvements" to "NETOBSERV-2660 Add server performance benchmark improvements" on May 11, 2026
@codecov

codecov Bot commented May 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.42%. Comparing base (8cfc3b1) to head (af74df8).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1444      +/-   ##
==========================================
- Coverage   52.53%   52.42%   -0.12%     
==========================================
  Files         232      233       +1     
  Lines       12355    12452      +97     
  Branches     1551     1564      +13     
==========================================
+ Hits         6491     6528      +37     
- Misses       5269     5310      +41     
- Partials      595      614      +19     
Flag Coverage Δ
uitests 56.54% <ø> (-0.20%) ⬇️
unittests 40.96% <ø> (ø)

Flags with carried forward coverage won't be shown.
see 41 files with indirect coverage changes



Labels

no-doc: This PR doesn't require documentation change on the NetObserv operator
no-qe: This PR doesn't necessitate QE approval

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant