[CCXDEV-16390] [Proposal] Add KinD scripts for running DVO e2e tests locally#676
[CCXDEV-16390] [Proposal] Add KinD scripts for running DVO e2e tests locally#676ikerreyes wants to merge 1 commit into
Conversation
Add scripts to create a KinD cluster, deploy the DVO operator, and run the existing e2e test suite without requiring a full OpenShift cluster. - kind/config.yaml: KinD cluster config with NodePort mapping for metrics - kind/setup.sh: creates cluster, builds and loads images, deploys DVO - kind/run-tests.sh: runs tests from the test container image - kind/teardown.sh: deletes the cluster - kind/README.md: usage and configuration documentation Usage: ./kind/setup.sh ./kind/run-tests.sh ./kind/teardown.sh Assisted-by: Claude Opus 4.6
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ikerreyes The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @ikerreyes. Thanks for your PR. I'm waiting for a app-sre member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
📝 WalkthroughWalkthroughThis PR introduces KinD (Kubernetes-in-Docker) infrastructure and scripting to enable local end-to-end testing of the deployment-validation-operator without an OpenShift cluster. It includes cluster configuration, provisioning, test execution, teardown scripts, and supporting documentation. ChangesKinD Local Testing Infrastructure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
kind/run-tests.sh (1)
8-8: ⚡ Quick win
DVO_METRICS_URLis hardcoded and not env-overridable.All other configurable values in this script are derived from environment variables with defaults (e.g.,
CLUSTER_NAME,TEST_IMAGE,HTTPD_IMAGE).DVO_METRICS_URLis the sole exception. If someone changeshostPortinkind/config.yaml(e.g., because8383is already in use), they must also edit this script.♻️ Proposed fix
-DVO_METRICS_URL="http://localhost:8383" +DVO_METRICS_URL="${DVO_METRICS_URL:-http://localhost:8383}"Also add a row to the README env-var table:
| `DVO_METRICS_URL` | `http://localhost:8383` | Metrics endpoint for the test suite |🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@kind/run-tests.sh` at line 8, Replace the hardcoded DVO_METRICS_URL assignment in kind/run-tests.sh with an environment-overridable default (use the existing pattern used for CLUSTER_NAME/TEST_IMAGE/HTTPD_IMAGE so DVO_METRICS_URL can be overridden when needed) and update any references to use that variable (look for DVO_METRICS_URL in the script). Also add a row to the README env-var table documenting DVO_METRICS_URL with default "http://localhost:8383" and a short description like "Metrics endpoint for the test suite".kind/setup.sh (1)
79-91: Consider folding the"2m"timeout substitution into the Python YAML transform.The sed pattern
s|value: "2m"|value: "10s"|(line 81) is unanchored — it relies on a single exact string match rather than targeting a specific field. While only one env var (VALIDATION_CHECK_INTERVAL) currently holds this value, the pattern remains fragile for future maintenance. Since the Python YAML transform already processes the document, the change can be done precisely there by targeting the specific env var by name.♻️ Proposed refactor — move timeout change into the Python block
sed -e "s|quay.io/deployment-validation-operator/dv-operator:latest|${DVO_IMAGE}|" \ -e "s|imagePullPolicy: Always|imagePullPolicy: IfNotPresent|" \ - -e 's|value: "2m"|value: "10s"|' \ "${DEPLOY_DIR}/operator.yaml" | \ python3 -c " import sys, yaml doc = yaml.safe_load(sys.stdin) spec = doc['spec']['template']['spec'] spec.pop('nodeSelector', None) spec.pop('tolerations', None) spec.pop('affinity', None) +for env in spec.get('containers', [{}])[0].get('env', []): + if env.get('name') == 'VALIDATION_CHECK_INTERVAL' and env.get('value') == '2m': + env['value'] = '10s' yaml.dump(doc, sys.stdout, default_flow_style=False) " | kubectl apply -f - -n "${DVO_NAMESPACE}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@kind/setup.sh` around lines 79 - 91, The sed-based substitution for value: "2m" is brittle; remove the sed pattern and instead update the Python YAML transform to find the container env list in the document and change the env var named VALIDATION_CHECK_INTERVAL from "2m" to "10s" (e.g., locate doc['spec']['template']['spec'] → containers → env and mutate the entry where env['name']=="VALIDATION_CHECK_INTERVAL"). Also keep the other edits (image and pullPolicy) in sed, but delete the unanchored s|value: "2m"|...|; ensure the Python block writes the modified YAML to stdout and preserves existing spec.pop(...) removals (nodeSelector, tolerations, affinity).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@kind/run-tests.sh`:
- Around line 53-69: The temporary kubeconfig created by KUBECONFIG_TMP via
mktemp can leak if any command (kind get kubeconfig, chmod, or the
${CONTAINER_ENGINE} run) fails; add a trap to ensure cleanup on any exit.
Specifically, define KUBECONFIG_TMP="$(mktemp)", then immediately register a
trap like trap 'rm -f "${KUBECONFIG_TMP}"' EXIT (or a function that removes the
file) before calling kind get kubeconfig and before running ${CONTAINER_ENGINE}
run, so rm -f "${KUBECONFIG_TMP}" is executed on normal exit and on errors; keep
the existing final rm -f as a no-op fallback.
In `@kind/teardown.sh`:
- Around line 6-8: The teardown script currently calls kind delete cluster
--name "${CLUSTER_NAME}" unconditionally which fails if the cluster doesn't
exist; modify teardown.sh to first check for the cluster's existence (e.g.,
using kind get clusters or kind get cluster --name "${CLUSTER_NAME}" and test
its exit/status) and only run kind delete cluster --name "${CLUSTER_NAME}" when
the cluster is present, mirroring the guard logic used in setup.sh so the script
doesn't exit non-zero under set -euo pipefail when the cluster was never
created.
---
Nitpick comments:
In `@kind/run-tests.sh`:
- Line 8: Replace the hardcoded DVO_METRICS_URL assignment in kind/run-tests.sh
with an environment-overridable default (use the existing pattern used for
CLUSTER_NAME/TEST_IMAGE/HTTPD_IMAGE so DVO_METRICS_URL can be overridden when
needed) and update any references to use that variable (look for DVO_METRICS_URL
in the script). Also add a row to the README env-var table documenting
DVO_METRICS_URL with default "http://localhost:8383" and a short description
like "Metrics endpoint for the test suite".
In `@kind/setup.sh`:
- Around line 79-91: The sed-based substitution for value: "2m" is brittle;
remove the sed pattern and instead update the Python YAML transform to find the
container env list in the document and change the env var named
VALIDATION_CHECK_INTERVAL from "2m" to "10s" (e.g., locate
doc['spec']['template']['spec'] → containers → env and mutate the entry where
env['name']=="VALIDATION_CHECK_INTERVAL"). Also keep the other edits (image and
pullPolicy) in sed, but delete the unanchored s|value: "2m"|...|; ensure the
Python block writes the modified YAML to stdout and preserves existing
spec.pop(...) removals (nodeSelector, tolerations, affinity).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Enterprise
Run ID: 07ffe223-4fea-4159-9362-42c529f05370
📒 Files selected for processing (5)
kind/README.mdkind/config.yamlkind/run-tests.shkind/setup.shkind/teardown.sh
| KUBECONFIG_TMP="$(mktemp)" | ||
| kind get kubeconfig --name "${CLUSTER_NAME}" > "${KUBECONFIG_TMP}" | ||
| chmod 644 "${KUBECONFIG_TMP}" | ||
|
|
||
| echo "==> Running tests from image: ${TEST_IMAGE}" | ||
| echo " Metrics: ${DVO_METRICS_URL}" | ||
|
|
||
| ${CONTAINER_ENGINE} run --rm \ | ||
| --network host \ | ||
| -v "${KUBECONFIG_TMP}:/tmp/kubeconfig:Z" \ | ||
| -e KUBECONFIG=/tmp/kubeconfig \ | ||
| -e DVO_METRICS_URL="${DVO_METRICS_URL}" \ | ||
| -e HTTPD_TEST_IMAGE="${HTTPD_IMAGE}" \ | ||
| "${TEST_IMAGE}" \ | ||
| pytest.sh "${PYTEST_ARGS[@]}" | ||
|
|
||
| rm -f "${KUBECONFIG_TMP}" |
There was a problem hiding this comment.
Temp kubeconfig is leaked when any subsequent command fails.
mktemp creates the file at line 53, but set -euo pipefail means the rm -f at line 69 is never reached if kind get kubeconfig, the chmod, or the ${CONTAINER_ENGINE} run exits non-zero. The file persists on disk indefinitely. Use trap to guarantee cleanup on any exit path.
🛡️ Proposed fix
KUBECONFIG_TMP="$(mktemp)"
+trap 'rm -f "${KUBECONFIG_TMP}"' EXIT
kind get kubeconfig --name "${CLUSTER_NAME}" > "${KUBECONFIG_TMP}"
chmod 644 "${KUBECONFIG_TMP}"
# ... rest of the script ...
-rm -f "${KUBECONFIG_TMP}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| KUBECONFIG_TMP="$(mktemp)" | |
| kind get kubeconfig --name "${CLUSTER_NAME}" > "${KUBECONFIG_TMP}" | |
| chmod 644 "${KUBECONFIG_TMP}" | |
| echo "==> Running tests from image: ${TEST_IMAGE}" | |
| echo " Metrics: ${DVO_METRICS_URL}" | |
| ${CONTAINER_ENGINE} run --rm \ | |
| --network host \ | |
| -v "${KUBECONFIG_TMP}:/tmp/kubeconfig:Z" \ | |
| -e KUBECONFIG=/tmp/kubeconfig \ | |
| -e DVO_METRICS_URL="${DVO_METRICS_URL}" \ | |
| -e HTTPD_TEST_IMAGE="${HTTPD_IMAGE}" \ | |
| "${TEST_IMAGE}" \ | |
| pytest.sh "${PYTEST_ARGS[@]}" | |
| rm -f "${KUBECONFIG_TMP}" | |
| KUBECONFIG_TMP="$(mktemp)" | |
| trap 'rm -f "${KUBECONFIG_TMP}"' EXIT | |
| kind get kubeconfig --name "${CLUSTER_NAME}" > "${KUBECONFIG_TMP}" | |
| chmod 644 "${KUBECONFIG_TMP}" | |
| echo "==> Running tests from image: ${TEST_IMAGE}" | |
| echo " Metrics: ${DVO_METRICS_URL}" | |
| ${CONTAINER_ENGINE} run --rm \ | |
| --network host \ | |
| -v "${KUBECONFIG_TMP}:/tmp/kubeconfig:Z" \ | |
| -e KUBECONFIG=/tmp/kubeconfig \ | |
| -e DVO_METRICS_URL="${DVO_METRICS_URL}" \ | |
| -e HTTPD_TEST_IMAGE="${HTTPD_IMAGE}" \ | |
| "${TEST_IMAGE}" \ | |
| pytest.sh "${PYTEST_ARGS[@]}" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@kind/run-tests.sh` around lines 53 - 69, The temporary kubeconfig created by
KUBECONFIG_TMP via mktemp can leak if any command (kind get kubeconfig, chmod,
or the ${CONTAINER_ENGINE} run) fails; add a trap to ensure cleanup on any exit.
Specifically, define KUBECONFIG_TMP="$(mktemp)", then immediately register a
trap like trap 'rm -f "${KUBECONFIG_TMP}"' EXIT (or a function that removes the
file) before calling kind get kubeconfig and before running ${CONTAINER_ENGINE}
run, so rm -f "${KUBECONFIG_TMP}" is executed on normal exit and on errors; keep
the existing final rm -f as a no-op fallback.
| echo "==> Deleting KinD cluster '${CLUSTER_NAME}'..." | ||
| kind delete cluster --name "${CLUSTER_NAME}" | ||
| echo "==> Done." |
There was a problem hiding this comment.
kind delete cluster fails when the cluster doesn't exist.
Unlike setup.sh, which guards deletion with an existence check, teardown.sh calls kind delete cluster unconditionally. With set -euo pipefail, this causes the script to exit non-zero if the cluster was never created (e.g., after a failed setup run), which is an annoying failure mode.
🛡️ Proposed fix — guard deletion like setup.sh does
-echo "==> Deleting KinD cluster '${CLUSTER_NAME}'..."
-kind delete cluster --name "${CLUSTER_NAME}"
+if kind get clusters 2>/dev/null | grep -q "^${CLUSTER_NAME}$"; then
+ echo "==> Deleting KinD cluster '${CLUSTER_NAME}'..."
+ kind delete cluster --name "${CLUSTER_NAME}"
+else
+ echo "==> No cluster '${CLUSTER_NAME}' found, nothing to do."
+fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo "==> Deleting KinD cluster '${CLUSTER_NAME}'..." | |
| kind delete cluster --name "${CLUSTER_NAME}" | |
| echo "==> Done." | |
| if kind get clusters 2>/dev/null | grep -q "^${CLUSTER_NAME}$"; then | |
| echo "==> Deleting KinD cluster '${CLUSTER_NAME}'..." | |
| kind delete cluster --name "${CLUSTER_NAME}" | |
| else | |
| echo "==> No cluster '${CLUSTER_NAME}' found, nothing to do." | |
| fi | |
| echo "==> Done." |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@kind/teardown.sh` around lines 6 - 8, The teardown script currently calls
kind delete cluster --name "${CLUSTER_NAME}" unconditionally which fails if the
cluster doesn't exist; modify teardown.sh to first check for the cluster's
existence (e.g., using kind get clusters or kind get cluster --name
"${CLUSTER_NAME}" and test its exit/status) and only run kind delete cluster
--name "${CLUSTER_NAME}" when the cluster is present, mirroring the guard logic
used in setup.sh so the script doesn't exit non-zero under set -euo pipefail
when the cluster was never created.
|
/ok-to-test |
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #676 +/- ##
=======================================
Coverage 46.16% 46.16%
=======================================
Files 22 22
Lines 1083 1083
=======================================
Hits 500 500
Misses 553 553
Partials 30 30 🚀 New features to boost your workflow:
|
Summary
Add scripts to create a KinD cluster, deploy the DVO operator, and run the
existing e2e test suite without requiring a full OpenShift cluster.
kind/config.yaml: KinD cluster config with NodePort mapping for metricskind/setup.sh: creates cluster, builds and loads images, deploys DVOkind/run-tests.sh: runs tests from the test container imagekind/teardown.sh: deletes the clusterkind/README.md: usage and configuration documentationUsage:
Dependencies
This MR requires the corresponding changes in:
How to test locally
The test container image (
quay.io/redhatqe/deployment-validation-operator-tests:latest)does not include the dependency changes yet. To test end-to-end, build a local
test image that overlays the unreleased changes.
1. Clone the repos side by side
2. Build a local test image
A
kind/Dockerfile.testsis provided for this purpose:Build from the parent directory so all repos are in the build context:
3. Run the setup and tests
cd deployment-validation-operator/ ./kind/setup.sh ./kind/run-tests.sh --image localhost/dvo-tests:dev4. Clean up
Assisted-by: Claude Opus 4.6
Summary by CodeRabbit
Documentation
Chores