Monitor cross-dc-relation (Feature?) #1808

Description

@marcusboden

When you have cross-site replication, you only notice Juju issues when you try to switch the sites.
I had a situation where the cross-site replication itself was fine but Juju had issues with the cross-site relation. It only showed up when I tried to do the switchover, which is usually a bit too late.

Steps to reproduce

  1. Deploy two PostgreSQL clusters
  2. Set up cross-site replication
  3. Have a Juju issue (e.g. user accidentally disabled)

Expected behavior

The charm realizes that it cannot reach the other juju controller anymore and warns the operator about the issue.

Actual behavior

Nothing happens, but if you try to switch over the sites, it does not work and the consumer site may go into an error state. Once that happens, the CMR (cross-model relation) also shows as failed.

Versions

Operating system: Ubuntu 24.04.4

Juju CLI: 3.5.7

Juju agent: 3.6.12

Charm revision: 1158

LXD: 5.0.6-7fc3b36

Log output

controller-0: 08:12:58 INFO juju.worker.remoterelations cmr start "hdev-swarm-a-postgresql-replication-offer"
controller-0: 08:12:58 ERROR juju.worker.remoterelations cmr error in remote application worker for hdev-swarm-a-postgresql-replication-offer: watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr stopped "hdev-swarm-a-postgresql-replication-offer", err: watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr non-fatal error "hdev-swarm-a-postgresql-replication-offer": watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 ERROR juju.worker.remoterelations cmr exited "hdev-swarm-a-postgresql-replication-offer": watching status for offer: cannot get discharge from "https://10.33.236.139:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
controller-0: 08:12:58 INFO juju.worker.remoterelations cmr restarting "hdev-swarm-a-postgresql-replication-offer" in 15s

Digging in the controller logs reveals:

controller-1: 07:52:34 ERROR juju.worker.remoterelations cmr error in remote application worker for hdev-swarm-b-postgresql-replication-offer: watching status for offer: cannot get discharge from "https://10.33.236.76:17070/offeraccess": third party refused discharge: cannot discharge: permission denied
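
Until the charm can detect this itself, one possible stopgap is to watch the controller logs for the error pattern quoted above. The following is only a rough sketch, not something the charm or Juju provides: the function name `failing_offers` and the regular expression are my own, and the pattern is derived solely from the log lines in this issue, so other Juju versions may word the message differently.

```python
import re
from collections import Counter

# Assumption: this pattern mirrors the ERROR lines quoted in this issue.
# It is not an official log format and may need adjusting per Juju version.
CMR_ERROR = re.compile(
    r"^(?P<unit>\S+): (?P<time>\d{2}:\d{2}:\d{2}) ERROR "
    r"juju\.worker\.remoterelations cmr error in remote application worker "
    r"for (?P<offer>[^:]+): (?P<reason>.*)$"
)


def failing_offers(log_lines):
    """Count 'cmr error in remote application worker' lines per offer name.

    Hypothetical helper: log_lines is any iterable of controller log lines.
    Lines that do not match the pattern are ignored.
    """
    counts = Counter()
    for line in log_lines:
        match = CMR_ERROR.match(line.strip())
        if match:
            counts[match.group("offer")] += 1
    return counts
```

The input could be, for example, lines captured from `juju debug-log -m controller`. Any offer with a non-zero count would be worth raising to the operator before a switchover is attempted.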

Additional context

Metadata
Labels: enhancement (New feature, UI change, or workload upgrade)