With cross-site replication, Juju issues only become visible when you try to switch sites.
In my case, cross-site replication itself was working fine, but Juju had issues with the cross-site relation. This only showed up when I tried to do the switchover, which is usually a bit too late.
Steps to reproduce
- Deploy two PostgreSQL applications
- Set up cross-site replication between them
- Introduce a Juju issue (e.g. a user gets accidentally disabled)
Expected behavior
The charm realizes that it can no longer reach the other Juju controller and warns the operator about the issue.
Actual behavior
Nothing happens until you try to switch over the sites. The switchover then fails and the consumer site may go into error states. Once that happens, the CMR also shows up in a failed state.
Versions
Operating system: Ubuntu 24.04.4
Juju CLI: 3.5.7
Juju agent: 3.6.12
Charm revision: 1158
LXD: 5.0.6-7fc3b36
Log output
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr start "hdev-swarm-a-postgresql-replication-offer"
controller-0: 08:12:58 ERROR juju.worker.remoterelations cmr error in remote application worker for hdev-swarm-a-postgresql-replication-offer: watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr stopped "hdev-swarm-a-postgresql-replication-offer", err: watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr non-fatal error "hdev-swarm-a-postgresql-replication-offer": watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 ERROR juju.worker.remoterelations cmr exited "hdev-swarm-a-postgresql-replication-offer": watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr restarting "hdev-swarm-a-postgresql-replication-offer" in 15s
Digging in the controller logs reveals:
controller-1: 07:52:34 ERROR juju.worker.remoterelations cmr error in remote application worker for hdev-swarm-b-postgresql-replication-offer: watching status for offer: cannot get discharge from "https://10.33.236.76:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
Additional context
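Until the charm warns about this itself, a stopgap could be to scan the controller logs for the errors shown above. The following is only a minimal sketch, not something the charm or Juju provides: the function name find_cmr_failures and the regular expression are my own, and the pattern is derived solely from the log lines in this report, so it may miss other message formats.

```python
import re

# Hypothetical pattern, built only from the ERROR lines quoted in this report.
# It matches the two ERROR variants emitted by juju.worker.remoterelations:
#   cmr error in remote application worker for <offer>: <reason>
#   cmr exited "<offer>": <reason>
CMR_ERROR = re.compile(
    r'ERROR juju\.worker\.remoterelations cmr '
    r'(?:error in remote application worker for (?P<worker>[^\s:]+)'
    r'|exited "(?P<exited>[^"]+)")'
    r':\s*(?P<reason>.*)'
)


def find_cmr_failures(log_lines):
    """Return a dict mapping offer name to the last error reason seen for it."""
    failures = {}
    for line in log_lines:
        match = CMR_ERROR.search(line)
        if match:
            offer = match.group("worker") or match.group("exited")
            failures[offer] = match.group("reason").strip()
    return failures
```

Feeding it the controller log lines (for example the text captured from juju debug-log on the controller model) yields one entry per affected offer, which could then be wired into whatever alerting is already in place. INFO lines such as "cmr restarting" are ignored on purpose, since they do not carry the failure reason.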