feat(infra): add multi-region deployment and failover strategy (#620) #695

Open
Nworah-Gabriel wants to merge 2 commits into rinafcode:main from Nworah-Gabriel:main

Conversation

@Nworah-Gabriel

Implement an active/warm-standby two-region topology by composing the existing single-region Terraform modules with aliased providers.

  • tf/multi-region: full stack (networking, compute, storage, monitoring) in primary + secondary regions
  • modules/dns-failover: Route 53 health checks + active-passive failover
  • modules/replication: S3 cross-region replication (uploads, backups)
  • modules/database-replica: cross-region RDS read replica + standby Redis
  • infra/scripts: failover.sh (promote/scale/failback), failover-drill.sh, validate-multiregion.sh
  • dr/: data-replication strategy + multi-region deployment runbook
  • add db_instance_arn output; required_providers in each module

Validated with terraform fmt + validate (single- and multi-region).

Closes #620

…code#620)

Implement an active/warm-standby two-region topology by composing the
existing single-region Terraform modules with aliased providers.

- tf/multi-region: full stack (networking, compute, storage, monitoring)
  in primary + secondary regions
- modules/dns-failover: Route 53 health checks + active-passive failover
- modules/replication: S3 cross-region replication (uploads, backups)
- modules/database-replica: cross-region RDS read replica + standby Redis
- infra/scripts: failover.sh (promote/scale/failback), failover-drill.sh,
  validate-multiregion.sh
- dr/: data-replication strategy + multi-region deployment runbook
- add db_instance_arn output; required_providers in each module

Validated with terraform fmt + validate (single- and multi-region).

Closes rinafcode#620
@drips-wave

drips-wave Bot commented May 29, 2026

@Nworah-Gabriel Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for this PR to be reviewed. Keep up the great work! 🚀

Learn more about application limits

rinafcode#615)

Implement reliable outbound webhook delivery on the existing webhooks worker:

- webhook-backoff.util: exponential backoff with equal jitter + max cap, and
  retryability classification (5xx/408/425/429/transport = retry; 4xx = permanent)
- webhook-delivery.service: HMAC-signed HTTP POST; retryable failures throw so
  Bull re-enqueues with backoff (retry queue); permanent/exhausted -> dead-letter
- webhook-monitor.service: failure counters + CustomMetrics/alert + events
- webhook-retry.config: env-overridable maxRetries/delays/jitter/timeout
- wire WebhooksWorker to the delivery service (optional-dep fallback for the
  orchestration pool); register WebhooksDeliveryModule
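
The backoff and retryability rules listed above can be sketched as follows. This is a minimal illustration, not the code in webhook-backoff.util: the names `BackoffOptions`, `equalJitterDelay`, and `isRetryable` are hypothetical, and treating every status outside the listed set as non-retryable is an assumption. Equal jitter here means half of the capped exponential delay is fixed and the other half is randomized.

```typescript
// Hypothetical names; the PR's webhook-backoff.util may be shaped differently.
interface BackoffOptions {
  baseDelayMs: number; // pre-jitter delay for the first retry
  maxDelayMs: number;  // upper cap applied before jitter
}

// Equal jitter: the result lies in [capped / 2, capped], where
// capped = min(maxDelayMs, baseDelayMs * 2^attempt).
function equalJitterDelay(
  attempt: number, // 0-based retry attempt
  opts: BackoffOptions,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const exponential = opts.baseDelayMs * Math.pow(2, attempt);
  const capped = Math.min(opts.maxDelayMs, exponential);
  const half = capped / 2;
  return Math.floor(half + random() * half);
}

// Classification from the commit message: 5xx, 408, 425, 429, and transport
// errors are retried; other 4xx responses are permanent failures.
// `status === undefined` models a transport failure (timeout, reset).
// Assumption: any other status is also treated as non-retryable.
function isRetryable(status?: number): boolean {
  if (status === undefined) return true;
  if (status >= 500 && status <= 599) return true;
  return status === 408 || status === 425 || status === 429;
}
```

With Bull, a delay function like this would typically be registered as a custom backoff strategy, so a retryable failure that throws is re-enqueued after the computed delay. How the PR wires this up is not shown here.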

Tests: 19 unit tests (backoff math, retry vs dead-letter, HMAC, job options).
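
For readers unfamiliar with HMAC-signed webhooks, the signing step might look roughly like the sketch below. The commit message does not state the hash algorithm, encoding, or header name, so SHA-256 with hex output is an assumption, and `signPayload` and `verifySignature` are hypothetical helper names. Only Node's built-in `crypto` module is used.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Sender side (hypothetical helper): hex-encoded HMAC-SHA256 of the raw body.
// The algorithm and encoding are assumptions, not taken from the PR.
function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Receiver side: recompute the signature and compare in constant time.
function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The signature should be computed over the exact bytes that are sent. Re-serializing the JSON on either side can change the byte sequence and make verification fail.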

Closes rinafcode#615
Development

Successfully merging this pull request may close these issues.

Add multi-region deployment and failover strategy