
[CRDB-53973] e2e/operator: Advanced e2e tests for bare-metal infrastructure(kind).#577

Draft
NishanthNalluri wants to merge 2 commits into master from nishanth/operator-advanced-tests



@NishanthNalluri NishanthNalluri commented Nov 20, 2025

This commit introduces advanced single-region and multi-region e2e tests for CockroachDB features such as WAL failover, EAR, and PCR. These tests verify supportability when the features are configured through Helm charts, rather than re-validating the features' functionality.

These advanced tests run once a week, like the existing weekly tests, and execute independently of the basic tests.

Implements CRDB-53973

@NishanthNalluri NishanthNalluri force-pushed the nishanth/operator-advanced-tests branch from d819594 to 41f6e9b Compare November 20, 2025 13:50
@NishanthNalluri NishanthNalluri changed the title e2e/operator: Advanced e2e tests for bare-metal infrastructure(kind). [CRDB-53973] e2e/operator: Advanced e2e tests for bare-metal infrastructure(kind). Nov 20, 2025

Copilot AI left a comment


Pull request overview

Adds a new “advanced features” operator e2e suite (gated by TEST_ADVANCED_FEATURES) to validate WAL failover, Encryption-at-Rest, and PCR when configured via Helm, and wires it into a scheduled weekly GitHub Actions workflow.

Changes:

  • Add single-region and multi-region advanced feature e2e tests (WAL failover, EAR, PCR).
  • Extend tests/e2e/operator/region.go with shared advanced install/validation helpers.
  • Add a scheduled GitHub Actions workflow and Makefile target to run the advanced suite.
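
The gating on TEST_ADVANCED_FEATURES described above could look roughly like the following. This is a minimal sketch, not the PR's code: the helper name `advancedEnabled` and the choice to treat unset or unparsable values as "disabled" are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// advancedEnabled reports whether the advanced suite should run.
// getenv is injected so the decision can be tested without touching
// the process environment. Assumption: unset or unparsable values
// mean "run the basic suite".
func advancedEnabled(getenv func(string) string) bool {
	v, err := strconv.ParseBool(getenv("TEST_ADVANCED_FEATURES"))
	return err == nil && v
}

func main() {
	fmt.Println("advanced suite enabled:", advancedEnabled(os.Getenv))
}
```

In a test entry point, the basic suite would call `t.Skip` when `advancedEnabled(os.Getenv)` is true, and the advanced suite would skip when it is false, so a single `go test` invocation runs exactly one of the two.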

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/e2e/operator/singleRegion/cockroachdb_single_region_e2e_test.go | Gates basic vs advanced single-region suites via TEST_ADVANCED_FEATURES. |
| tests/e2e/operator/singleRegion/cockroachdb_single_region_advanced_features_test.go | New advanced single-region tests for WAL failover/EAR/PCR. |
| tests/e2e/operator/region.go | Adds advanced install/validation helpers (WAL failover, EAR, PCR, namespace helpers). |
| tests/e2e/operator/multiRegion/cockroachdb_multi_region_e2e_test.go | Gates basic vs advanced multi-region suites via TEST_ADVANCED_FEATURES. |
| tests/e2e/operator/multiRegion/cockroachdb_multi_region_advanced_features_test.go | New advanced multi-region tests for WAL failover/EAR/PCR. |
| Makefile | Adds test/nightly-e2e/advanced target to run advanced suite. |
| .github/workflows/integration-tests-advanced.yaml | Adds weekly scheduled workflow to run advanced tests (k3d/kind). |


Comment on lines +936 to +958

func (r *Region) GenerateEncryptionKey(t *testing.T) string {
	// Generate 256-bit AES key using cockroach gen encryption-key
	cmd := shell.Command{
		Command:    "cockroach",
		Args:       []string{"gen", "encryption-key", "--size", "256", "store.key"},
		WorkingDir: ".",
	}

	_, err := shell.RunCommandAndGetOutputE(t, cmd)
	require.NoError(t, err)

	// Read the generated key file
	keyBytes, err := os.ReadFile("store.key")
	require.NoError(t, err)

	// Base64 encode the key (removing any newlines)
	storeKeyB64 := base64.StdEncoding.EncodeToString(keyBytes)
	storeKeyB64 = strings.ReplaceAll(storeKeyB64, "\n", "")

	// Clean up the key file
	os.Remove("store.key")

	return storeKeyB64
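
The excerpt above writes store.key into the working directory and removes it only on the success path, so a failed `require.NoError` leaves the key file behind. The `strings.ReplaceAll` call is also a no-op, because `base64.StdEncoding.EncodeToString` never emits newlines. One possible shape for a safer helper is sketched below. The names `generateKeyB64` and `fakeGen` are hypothetical, and `fakeGen` stands in for the `cockroach gen encryption-key` invocation so the example runs without the binary.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"os"
	"path/filepath"
)

// generateKeyB64 calls gen with a path inside a private temporary
// directory, then returns the base64-encoded file contents. The
// directory is removed on every return path, including errors.
func generateKeyB64(gen func(path string) error) (string, error) {
	dir, err := os.MkdirTemp("", "store-key-")
	if err != nil {
		return "", fmt.Errorf("create temp dir: %w", err)
	}
	defer os.RemoveAll(dir)

	keyPath := filepath.Join(dir, "store.key")
	if err := gen(keyPath); err != nil {
		return "", fmt.Errorf("generate key: %w", err)
	}
	keyBytes, err := os.ReadFile(keyPath)
	if err != nil {
		return "", fmt.Errorf("read key: %w", err)
	}
	// StdEncoding output contains no newlines, so no stripping is needed.
	return base64.StdEncoding.EncodeToString(keyBytes), nil
}

// fakeGen is a hypothetical stand-in for the cockroach binary: it
// writes 64 zero bytes to path with owner-only permissions.
func fakeGen(path string) error {
	return os.WriteFile(path, make([]byte, 64), 0o600)
}

func main() {
	key, err := generateKeyB64(fakeGen)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("encoded length:", len(key))
}
```

Inside a test that has a `*testing.T`, `t.TempDir()` would give the same cleanup guarantee with less code.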

Comment on lines +1268 to +1286

readFromStandby, err := execSQLOnVC(t, standbyKubectlOptions, standbyPod, readerURI, "bank", "SELECT id, balance FROM accounts ORDER BY id")
if err != nil {
	t.Logf("Warning: Read from standby reader failed (may need more time): %v", err)
	// Try reading directly from main VC as fallback
	t.Log("Trying to read from main virtual cluster instead...")
	standbyMainURI := generateLocalTenantURI("main")
	readFromStandby, err = execSQLOnVC(t, standbyKubectlOptions, standbyPod, standbyMainURI, "bank", "SELECT id, balance FROM accounts ORDER BY id")
	if err != nil {
		t.Logf("Warning: Read from main VC also failed: %v", err)
	}
}

if err == nil {
	require.Contains(t, readFromStandby, "1000", "Should be able to read replicated data from standby")
	require.Contains(t, readFromStandby, "250", "Should be able to read replicated data from standby")
	t.Log("✓ Successfully read replicated data from standby reader!")
	t.Logf("Data from standby reader:\n%s", readFromStandby)
}
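
In the excerpt above, a failure on both the reader URI and the main virtual cluster only logs a warning, so the `require.Contains` checks are skipped and the test can pass without reading any replicated data. A bounded retry that returns the last error would let the caller fail instead. The sketch below is one way to do that. `pollUntil` and `demo` are hypothetical names, and the closure in `demo` stands in for the `execSQLOnVC` call.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollUntil calls fn until it succeeds or the next attempt would start
// after the deadline. On timeout it returns the last error, wrapped,
// so the caller can fail the test rather than continue silently.
func pollUntil(timeout, interval time.Duration, fn func() (string, error)) (string, error) {
	deadline := time.Now().Add(timeout)
	for {
		out, err := fn()
		if err == nil {
			return out, nil
		}
		if time.Now().Add(interval).After(deadline) {
			return "", fmt.Errorf("gave up after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// demo simulates a standby reader that becomes available on the
// third attempt and reports how many attempts were made.
func demo() (out string, attempts int, err error) {
	out, err = pollUntil(time.Second, 10*time.Millisecond, func() (string, error) {
		attempts++
		if attempts < 3 {
			return "", errors.New("standby not ready")
		}
		return "1 | 1000", nil
	})
	return out, attempts, err
}

func main() {
	out, attempts, err := demo()
	fmt.Println(out, attempts, err)
}
```

In the test, the result would feed `require.NoError` followed by the existing `require.Contains` assertions, so a standby that never becomes readable fails the run.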

Comment on lines +140 to +142

test/nightly-e2e/advanced: bin/cockroach bin/kubectl bin/helm build/self-signer bin/k3d bin/kind
	@PATH="$(PWD)/bin:${PATH}" TEST_ADVANCED_FEATURES=true go test -timeout 90m -v -test.run TestOperatorInSingleRegion ./tests/e2e/operator/singleRegion/... || (echo "Advanced features tests failed with exit code $$?" && exit 1)

Comment on lines +108 to +131

t.Logf("Installing region 0 (%s) with encryption at rest enabled", cluster0)
config0 := operator.AdvancedInstallConfig{
	EncryptionEnabled:   true,
	EncryptionKeySecret: encryptionKeyB64,
	CustomRegions:       encryptionRegions0,
	CustomValues: map[string]string{
		// Override secret name to use custom name
		"cockroachdb.crdbCluster.encryptionKeySecretName": secretName0,
	},
}
r.InstallChartsWithAdvancedConfig(t, cluster0, 0, config0)

// Manually create the secret with custom name in region 0
kubeConfig, _ := r.GetCurrentContext(t)
kubectlOptions0 := k8s.NewKubectlOptions(cluster0, kubeConfig, r.Namespace[cluster0])

// Delete the default secret if it exists
_ = k8s.RunKubectlE(t, kubectlOptions0, "delete", "secret", "cmek-key-secret", "--ignore-not-found")

// Create the custom named secret
err := k8s.RunKubectlE(t, kubectlOptions0, "create", "secret", "generic", secretName0,
	fmt.Sprintf("--from-literal=StoreKeyData=%s", encryptionKeyB64))
require.NoError(t, err)
t.Logf("Created encryption secret %s in region 0", secretName0)
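
Passing the key through `--from-literal` puts it on the kubectl command line, and `kubectl create` fails if the secret already exists. One alternative is to build a Secret manifest and apply it, which keeps the key out of the argument list and is idempotent. The sketch below only builds the manifest with the standard library. `secretManifest` is a hypothetical helper, and the secret name, namespace, and value in `main` are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// secretManifest returns a JSON manifest for an Opaque Secret holding
// a single key. stringData is used so the API server performs the
// base64 encoding of the value when the Secret is stored.
func secretManifest(name, namespace, key, value string) (string, error) {
	m := map[string]any{
		"apiVersion": "v1",
		"kind":       "Secret",
		"type":       "Opaque",
		"metadata":   map[string]string{"name": name, "namespace": namespace},
		"stringData": map[string]string{key: value},
	}
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Placeholder values for illustration only.
	m, err := secretManifest("custom-key-secret", "default", "StoreKeyData", "a2V5")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m)
}
```

kubectl accepts JSON manifests, so the result could be passed to `kubectl apply -f -`, or to terratest's `k8s.KubectlApplyFromStringE` if the terratest version in use provides it.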

Comment on lines +334 to +335

k8s.RunKubectl(t, kubectlOptions, "create", "secret", "generic", "cmek-key-secret-new",
	"--from-literal=StoreKeyData="+newEncryptionKeyB64)

Comment on lines +796 to +810

// Create encryption key secret if encryption is enabled
if config.EncryptionEnabled && config.EncryptionKeySecret != "" {
	// Create secret with base64 encoded AES key
	err = k8s.RunKubectlE(t, kubectlOptions, "create", "secret", "generic", "cmek-key-secret",
		fmt.Sprintf("--from-literal=StoreKeyData=%s", config.EncryptionKeySecret))
	require.NoError(t, err)

	// Verify secret was created with data
	secretSize, err := k8s.RunKubectlAndGetOutputE(t, kubectlOptions,
		"get", "secret", "cmek-key-secret",
		"-o", "jsonpath={.data.StoreKeyData}")
	require.NoError(t, err)
	require.True(t, len(secretSize) > 0, "Secret StoreKeyData should be >0")
	t.Logf("Created encryption secret with size: %d bytes", len(secretSize))
}
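
In the excerpt above, the variable named `secretSize` holds the base64 text that jsonpath returns for `.data.StoreKeyData`, so the logged "size" is the length of the encoded string, and the check only proves the field is non-empty. A stricter check decodes the field and compares it with the value that was stored. The following is a minimal sketch; `verifySecretData` is a hypothetical helper name.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// verifySecretData decodes the base64 string that
// `kubectl get secret ... -o jsonpath={.data.<key>}` prints and
// compares the decoded bytes with the value that was stored.
// The error reports lengths only, so key material is never logged.
func verifySecretData(jsonpathOutput, want string) error {
	got, err := base64.StdEncoding.DecodeString(strings.TrimSpace(jsonpathOutput))
	if err != nil {
		return fmt.Errorf("decode secret data: %w", err)
	}
	if string(got) != want {
		return fmt.Errorf("secret data mismatch: got %d bytes, want %d bytes", len(got), len(want))
	}
	return nil
}

func main() {
	// Simulate what the API server stores for the literal "a2V5".
	stored := base64.StdEncoding.EncodeToString([]byte("a2V5"))
	fmt.Println(verifySecretData(stored, "a2V5"))
}
```

In the test, `require.NoError(t, verifySecretData(output, config.EncryptionKeySecret))` would replace the `len(...) > 0` assertion.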

Comment on lines +900 to +902

if err == nil && walFailoverEnv != "" {
	t.Logf("COCKROACH_WAL_FAILOVER environment variable is set to: %s", walFailoverEnv)
}
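
The excerpt above only logs the variable when it is set, so a pod without COCKROACH_WAL_FAILOVER would not fail the validation. An explicit check against the expected state could look like the sketch below. `checkWALFailoverEnv` is a hypothetical helper, and treating an empty value or "disabled" as "off" is an assumption about how the chart sets the variable.

```go
package main

import "fmt"

// checkWALFailoverEnv returns an error when the observed value of
// COCKROACH_WAL_FAILOVER does not match the expected state.
// Assumption: an empty value or "disabled" means WAL failover is off;
// any other value (for example "among-stores") means it is on.
func checkWALFailoverEnv(value string, wantEnabled bool) error {
	enabled := value != "" && value != "disabled"
	if enabled != wantEnabled {
		return fmt.Errorf("COCKROACH_WAL_FAILOVER=%q: enabled=%t, want %t", value, enabled, wantEnabled)
	}
	return nil
}

func main() {
	fmt.Println(checkWALFailoverEnv("among-stores", true))
	fmt.Println(checkWALFailoverEnv("", true))
}
```

The enable and disable scenarios could then both assert on the result with `require.NoError`.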
nameisbhaskar added a commit that referenced this pull request Mar 27, 2026
This commit adds comprehensive AWS/EKS infrastructure support and advanced
e2e testing capabilities for CockroachDB operator, enabling both basic and
advanced feature testing across multiple cloud providers.

## AWS Infrastructure Support

- **EKS Cluster Provisioning**: Full support for creating EKS clusters with
  eksctl, including automatic EBS CSI driver installation for EKS 1.23+
- **Multi-Region Support**: VPC peering for cross-region connectivity with
  proper security group ordering
- **Network Configuration**: Support for 3 regions (us-east-1, us-east-2,
  us-west-2) with non-overlapping CIDR blocks
- **Corporate Proxy & TLS**: Handle corporate TLS inspection proxies with
  optional TLS verification bypass via KUBECTL_INSECURE_SKIP_TLS_VERIFY
- **Resource Cleanup**: Comprehensive cleanup script with TestRunID-based
  tagging for concurrent test isolation

## Advanced Features Testing

Cherry-picked from PR #577 and adapted for AWS support:

- **WAL Failover Tests**: Single and multi-region tests for WAL failover
  functionality with enable/disable scenarios
- **Encryption at Rest (EAR)**: Tests for enabling, disabling, and modifying
  encryption at rest configurations
- **Physical Cluster Replication (PCR)**: Multi-region PCR testing
- **Test Isolation**: Environment variable TEST_ADVANCED_FEATURES separates
  basic and advanced test execution
- **AWS Provider Support**: Advanced tests work out-of-the-box with AWS
  infrastructure

## Code Improvements

- **Removed IsMultiRegion flag**: Replaced redundant boolean with actual
  cluster count checks (len(r.Clusters) > 1)
- **Safer conditional logic**: Changed early returns to conditional blocks
  for future-proof code execution
- **Cleaner architecture**: Single source of truth (cluster count) instead
  of maintaining separate boolean state
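
The "single source of truth" point can be illustrated with a derived check in place of a stored flag. This is a sketch under assumptions: the real `Region` type has more fields, the element type of `Clusters` is assumed to be `string`, and the method name `multiRegion` is hypothetical (the commit describes inline `len(r.Clusters) > 1` checks).

```go
package main

import "fmt"

// Region is a reduced, hypothetical version of the test helper type.
// Only the field needed for this example is shown.
type Region struct {
	Clusters []string
}

// multiRegion derives the answer from the cluster list, so it cannot
// drift out of sync the way a separately maintained boolean could.
func (r *Region) multiRegion() bool {
	return len(r.Clusters) > 1
}

func main() {
	single := &Region{Clusters: []string{"cluster-0"}}
	multi := &Region{Clusters: []string{"cluster-0", "cluster-1"}}
	fmt.Println(single.multiRegion(), multi.multiRegion())
}
```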

## CI/CD Updates

- **Advanced Tests Workflow**: New GitHub Actions workflow for weekly
  advanced feature testing
- **Multi-Provider Support**: Supports k3d, kind, and AWS providers
- **AWS Setup**: Automated AWS credentials configuration and eksctl
  installation for AWS tests
- **Makefile Target**: New test/nightly-e2e/advanced target for running
  advanced test suite

## Files Changed

- tests/e2e/operator/infra/aws.go (new) - AWS infrastructure provisioning
- tests/e2e/operator/infra/cleanup-aws-resources.sh (new) - Resource cleanup
- tests/e2e/operator/singleRegion/cockroachdb_single_region_advanced_features_test.go (new)
- tests/e2e/operator/multiRegion/cockroachdb_multi_region_advanced_features_test.go (new)
- tests/e2e/operator/infra/local.go - Removed IsMultiRegion, improved BGP peering logic
- tests/e2e/operator/region.go - Removed IsMultiRegion field
- .github/workflows/integration-tests-advanced.yaml (new) - Advanced tests CI
- Makefile - Added test/nightly-e2e/advanced target

Resolves: https://cockroachlabs.atlassian.net/browse/CRDB-53967

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@shreyaskm623 shreyaskm623 force-pushed the nishanth/operator-advanced-tests branch 2 times, most recently from 434f901 to 25ce183 Compare March 31, 2026 12:25
@shreyaskm623 shreyaskm623 force-pushed the nishanth/operator-advanced-tests branch from 25ce183 to c8932aa Compare March 31, 2026 13:03

3 participants