
fix(relay-server): advertise configured external multiaddrs on / and /enr#428

Open
varex83 wants to merge 3 commits into main from fix/relay-advertise-external-multiaddrs


varex83 commented May 20, 2026

Summary

Fixes both / and /enr returning no useful data in private-network / K8s deployments where libp2p only sees private listen addresses and filter_private_addrs=true.

Root cause: AppState.addrs was populated exclusively from SwarmEvent::NewListenAddr. External addresses configured via --p2p-external-ip / --p2p-external-hostname reach the swarm through add_external_address(...) but never entered that Vec, so:

  • / returned [] once libp2p's listen addrs were filtered out as private.
  • /enr short-circuited with 500 "no addresses" before apply_ip_override ran, making external_ip and the DNS resolver dead code in this deployment shape.

Fix: compute external_tcp_multiaddrs + external_udp_multiaddrs from config at relay startup, thread them through enr_server into AppState as an immutable Vec<Multiaddr>, and union them with the live listeners (externals first, deduped) when serving both endpoints. Mirrors Go charon's AddrsFactory + filterAdvertisedAddrs shape.
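
The union step can be sketched as follows. This is a minimal illustration rather than the code in this PR: multiaddrs are modelled as plain strings so the example needs no libp2p dependency, and the function name `union_externals_first` is hypothetical.

```rust
/// Hypothetical stand-in for the union performed when serving `/` and `/enr`.
/// Addresses are plain strings here; the real code works with `Multiaddr`.
fn union_externals_first(external: &[String], listeners: &[String]) -> Vec<String> {
    let mut union: Vec<String> = Vec::with_capacity(external.len() + listeners.len());
    // Externals are chained first so they keep priority in the output.
    for addr in external.iter().chain(listeners.iter()) {
        // A linear scan is enough for the handful of addresses a relay advertises.
        if !union.contains(addr) {
            union.push(addr.clone());
        }
    }
    union
}
```

With one configured external address and a listener list that repeats it, the external appears once, ahead of any remaining listener addresses.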

For /enr specifically, the TCP/UDP scan loop now has a DNS fallback: /dns/<host>/tcp/<port> and /dns/<host>/udp/<port>/quic-v1 multiaddrs are resolved via the cached external_host_ip produced by the existing resolver loop. The /ip4/... + apply_ip_override path is unchanged, so external_ip keeps priority over external_host when both are set.
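
A rough sketch of the DNS fallback and the priority rule, under stated assumptions: it parses multiaddrs as strings instead of using libp2p's `Multiaddr` protocol iterator, and the names `dns_host_and_tcp_port` and `enr_ip` are hypothetical, not the helpers added in this PR. Accepting `dns` and `dns4` only is also an assumption of the sketch.

```rust
use std::net::Ipv4Addr;

/// Hypothetical string-based analogue of a DNS/TCP extractor.
/// Returns (host, port) for `/dns/<host>/tcp/<port>` or `/dns4/<host>/tcp/<port>`.
fn dns_host_and_tcp_port(addr: &str) -> Option<(String, u16)> {
    // Skip the empty segment produced by the leading '/'.
    let mut parts = addr.split('/').skip(1);
    let proto = parts.next()?;
    if !matches!(proto, "dns" | "dns4") {
        return None;
    }
    let host = parts.next()?;
    if parts.next()? != "tcp" {
        return None;
    }
    let port = parts.next()?.parse().ok()?;
    Some((host.to_string(), port))
}

/// Picks the IP for the ENR: a configured external IP wins, otherwise the
/// resolver-cached IP of the external hostname is used (if one is cached).
fn enr_ip(external_ip: Option<Ipv4Addr>, cached_host_ip: Option<Ipv4Addr>) -> Option<Ipv4Addr> {
    external_ip.or(cached_host_ip)
}
```

When the resolver has not yet cached an IP and no external IP is set, `enr_ip` yields `None`, which corresponds to the case where no ENR can be built from the DNS address.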

Files

  • crates/p2p/src/utils.rs: external_tcp_multiaddrs / external_udp_multiaddrs visibility bumped from pub(crate) to pub.
  • crates/relay-server/src/utils.rs: new extract_dns_and_{tcp,udp}_port helpers.
  • crates/relay-server/src/web.rs: AppState gains external_addrs + advertised_addrs(); both handlers use the union.
  • crates/relay-server/src/p2p.rs: compute external_addrs and pass to enr_server.

Verification

Ran locally with the same env shape as the affected K8s deployment:

PLUTO_HTTP_ADDRESS=127.0.0.1:43640 \
PLUTO_P2P_ADVERTISE_PRIVATE_ADDRESSES=false \
PLUTO_P2P_EXTERNAL_HOSTNAME=example.com \
PLUTO_P2P_TCP_ADDRESS=127.0.0.1:43610 \
PLUTO_P2P_UDP_ADDRESS=127.0.0.1:43610 \
PLUTO_AUTO_P2PKEY=true \
./target/debug/pluto relay

/:

[
  "/dns/example.com/tcp/43610/p2p/16Uiu2HAm...",
  "/dns/example.com/udp/43610/quic-v1/p2p/16Uiu2HAm..."
]

/enr returned 200 OK with a valid ENR containing example.com's resolved IPv4 address.

varex83 added 3 commits May 20, 2026 22:35
…/enr

The /multiaddr and /enr handlers read from AppState.addrs, which was
populated exclusively by SwarmEvent::NewListenAddr. Configured external
addresses (from --p2p-external-ip / --p2p-external-hostname) are pushed
into the libp2p swarm via add_external_address but never appeared in
that Vec, so:

  - /     returned [] in private-network deployments (filter_private_addrs
          drops the libp2p listen addrs; no externals to fall back to).
  - /enr  short-circuited with 500 "no addresses" before apply_ip_override
          could run, making both external_ip and the DNS resolver dead
          code in K8s-style deployments.

Compute external_tcp_multiaddrs + external_udp_multiaddrs once at relay
startup, thread them into AppState as an immutable Vec, and union them
with the live listeners (externals first, deduped) when serving both
endpoints. Mirrors Go charon's AddrsFactory + filterAdvertisedAddrs
shape.

For /enr, extend the TCP/UDP scan loop with a DNS fallback: if the
candidate multiaddr is /dns/<host>/{tcp,udp}/<port>, substitute the
resolver-cached external_host IP. Existing /ip4/... + apply_ip_override
path is unchanged, so external_ip continues to win over external_host
when both are set.

Verified locally with PLUTO_P2P_EXTERNAL_HOSTNAME=example.com,
PLUTO_P2P_ADVERTISE_PRIVATE_ADDRESSES=false, loopback listen addrs:
/ returns /dns/example.com/{tcp,udp}/... and /enr returns a valid ENR
with example.com's resolved IP embedded.

Adds 33 unit tests across the new code paths:

- utils: extract_dns_and_{tcp,udp}_port positive/negative cases for
  Dns/Dns4/Dns6, plus regression coverage for the IPv4 extractors and
  is_public_addr.
- web::AppState::advertised_addrs: union/dedup ordering with externals
  first, listener-vs-external duplicates collapsed, empty-state.
- web::multiaddr_handler: returns externals first with /p2p/<peer-id>
  encapsulated; empty when nothing configured; external_ip and
  external_host cases.
- web::enr_handler: 500 when nothing configured, external_ip baked into
  ENR (with and without conflicting listener IP), external_host DNS
  fallback uses resolver-cached IP, external_ip wins over external_host
  when both are set, public listener used when no externals.

Adds #[derive(Debug)] on HandlerError so tests can use `expect_err`.

Spins up the real `enr_server` axum app on an ephemeral 127.0.0.1 port
and exercises both routes over a live TCP socket via reqwest. Three
scenarios:

- external_ip only: asserts `/` returns /ip4/<ip>/{tcp,udp,quic-v1}
  multiaddrs with peer-id encapsulation and that /enr returns a valid
  ENR with the external IP baked in.
- empty config: asserts `/` returns [] and /enr returns 500.
- external_host=localhost: asserts `/` emits /dns/localhost/...
  multiaddrs verbatim and polls /enr until the resolver populates the
  cache, then asserts the ENR contains a loopback IP.

The localhost scenario relies on /etc/hosts resolution rather than
public DNS, so the suite is hermetic in CI. Each test cancels the
server via CancellationToken and waits with a bounded timeout to keep
flaky runs visible instead of hanging.
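
The "poll until the cache is populated, but give up after a bound" pattern can be sketched with the standard library alone. The tests in this PR are async and use reqwest; the synchronous `poll_until` helper below is a hypothetical simplification of the same idea.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: call `probe` until it yields a value or `timeout`
/// elapses. Returning `None` on timeout makes a slow run fail loudly
/// instead of hanging.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = probe() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}
```

A test would pass a probe that requests /enr and returns `Some(enr)` only on a 200 response, then assert that the overall result is `Some`.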

To make `enr_server` callable from a `tests/` crate, re-export it from
the crate root as a `#[doc(hidden)]` item. Adds `reqwest` and
`serde_json` to [dev-dependencies].

The existing `.github/workflows/test.yml` runs
`cargo test --locked --workspace --all-features`, which picks up the
new tests on both linux/amd64 and linux/arm64 runners. `linter.yml`
runs clippy with --all-targets, which covers the test code too. No CI
workflow changes are required.
/// addresses at ingest time when `filter_private_addrs` is set.
async fn advertised_addrs(&self) -> Vec<Multiaddr> {
    let listeners = self.addrs.read().await;
    let mut union: Vec<Multiaddr> = self.external_addrs.clone();

external_addrs can contain duplicates. It would be better to use a HashSet here to simplify the code.
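
One possible reading of this suggestion, sketched with strings in place of `Multiaddr` and assuming the address type implements `Hash` and `Eq`: collecting straight into a `HashSet` would lose the "externals first" ordering, so the set is used only to track what has already been emitted. The function name `dedup_preserving_order` is hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical helper: drop repeated addresses while keeping first-seen order.
fn dedup_preserving_order(addrs: impl IntoIterator<Item = String>) -> Vec<String> {
    let mut seen = HashSet::new();
    addrs
        .into_iter()
        // `insert` returns false for a value already in the set, so repeats
        // (including duplicates within the externals) are filtered out.
        .filter(|addr| seen.insert(addr.clone()))
        .collect()
}
```

Feeding it the externals chained with the listeners would also collapse duplicates inside `external_addrs` itself, which is the case the comment raises.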
