Clean leftover gateway ext-ports before func tests #410

Merged: dosaboy merged 1 commit into canonical:main from xtrusia:clean-orphan-ext-ports on Jun 26, 2026

@xtrusia (Contributor) commented Jun 25, 2026

zaza creates an undercloud neutron port named <server>_ext-port and attaches it to each gateway instance with interface_attach.
nova does not delete such a user-supplied port on teardown; it only detaches the port and leaves it behind.
juju usually cleans it up, because its openstack provider deletes ports filtered by device_id == server. But if nova clears device_id before that query runs, the port delete errors out, or teardown is forced, the detached port survives on the shared external network.
A later test whose floating IP lands on that leaked address then collides with the dead port at L2/ARP and cannot reach its guest, which is why the failure shows up intermittently.
This change deletes, before each model is built, any ext-port that is detached or whose device_id points at an instance that no longer exists; a port still attached to a live server is left alone.
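
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the contents of clean_orphan_dataports.sh: the function names, the dict layout of a port record, and the sample IDs are all hypothetical, and fetching the port and server lists from the undercloud is left out.

```python
from typing import Dict, Iterable, List, Set

# Name suffix taken from the PR description; everything else here is illustrative.
EXT_PORT_SUFFIX = "_ext-port"


def is_orphan_ext_port(port: Dict[str, str], live_server_ids: Set[str]) -> bool:
    """Return True if `port` is an ext-port that no live server is using."""
    if not port.get("name", "").endswith(EXT_PORT_SUFFIX):
        return False  # not a gateway ext-port, never touch it
    device_id = port.get("device_id") or ""
    if not device_id:
        return True  # detached: device_id has been cleared
    # Attached on paper, but orphaned if the owning instance is gone.
    return device_id not in live_server_ids


def select_orphans(ports: Iterable[Dict[str, str]],
                   live_server_ids: Iterable[str]) -> List[str]:
    """Return the IDs of ports that would be deleted before a model build."""
    live = set(live_server_ids)
    return [p["id"] for p in ports if is_orphan_ext_port(p, live)]


# Hypothetical sample data: one live gateway, two leaked ports, one unrelated port.
ports = [
    {"id": "p1", "name": "gw-0_ext-port", "device_id": "srv-live"},
    {"id": "p2", "name": "gw-1_ext-port", "device_id": ""},
    {"id": "p3", "name": "gw-2_ext-port", "device_id": "srv-gone"},
    {"id": "p4", "name": "unrelated-port", "device_id": ""},
]
print(select_orphans(ports, {"srv-live"}))  # ['p2', 'p3']
```

Note that p2 is exactly the case a device_id == server filter cannot see: once device_id is cleared, no per-server query returns it, so it has to be found by name instead.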

This does not fully fix the leak. Teardown is asynchronous, so the cleanup can miss the port leaked by the most recent model.
But it removes older leaked ports before each build, so the collision becomes much less likely.

@dosaboy (Member) left a comment

Minor nit, but otherwise LGTM.

Comment thread openstack/tools/clean_orphan_dataports.sh Outdated
zaza creates an undercloud neutron port <server>_ext-port and attaches it to each gateway instance with interface_attach.
nova does not delete such a user-supplied port on teardown - it only detaches it and leaves it behind.
juju usually cleans it up - its openstack provider deletes ports filtered by device_id == server - but if nova clears device_id before that query runs, or the port delete errors out, or teardown is forced, the detached port survives on the shared external network.
A later test whose floating IP lands on that leaked address then collides with the dead port at L2/ARP and can't reach its guest, so the failure shows up intermittently.
Before each model is built, delete ext-ports that are detached or whose device_id points at a gone instance; a port still attached to a live server is left alone.

Signed-off-by: Seyeong Kim <seyeong.kim@canonical.com>
@xtrusia force-pushed the clean-orphan-ext-ports branch from e643dfd to 695ee8a on June 25, 2026 09:00

@pponnuvel (Member) left a comment

LGTM

@dosaboy added this pull request to the merge queue Jun 26, 2026
Merged via the queue into canonical:main with commit 7c6c753 Jun 26, 2026
5 checks passed