Expected Behavior
If a custom module A is down (stopped, failed, hung, or idle) and another custom module B keeps sending messages to it, expired messages should be deleted from the edgeHub storage so that the disk does not fill up.
Current Behavior
Even if the TTL is already expired, messages stay in the edgeHub storage.
Steps to Reproduce
- Have one module publish a large number of messages to another module.
- Let the receiving module crash a few seconds after every startup (after 20 retries, edgeAgent no longer restarts the module).
- Check the size of the edgeHub storage directory.
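For the last step, a small script makes it easier to watch the growth over time. This is only a sketch: it assumes the default storage path /var/lib/aziot/storage/edgeHub/ mentioned in the logs section, and the helper name directory_size_bytes is my own, not part of any IoT Edge tooling.

```python
import os


def directory_size_bytes(path):
    """Return the total size in bytes of all files found under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                # lstat reports a symlink's own size instead of following it.
                total += os.lstat(file_path).st_size
            except OSError:
                # Files can disappear while the database compacts; skip them.
                pass
    return total


if __name__ == "__main__":
    # Assumed default location of the edgeHub store on the host.
    storage = "/var/lib/aziot/storage/edgeHub/"
    size_mib = directory_size_bytes(storage) / 2**20
    print(f"{storage}: {size_mib:.1f} MiB")
```

Running it periodically (for example from cron) while the receiving module is down should show whether the directory keeps growing after the TTL has passed.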
Context (Environment)
Output of iotedge check
Device Information
- Host OS [e.g. Ubuntu 22.04, Windows Server IoT 2019]: Ubuntu 22.04 and Debian 12
- Architecture [e.g. amd64, arm32, arm64]: amd64 and arm64
- Container OS [e.g. Linux containers, Windows containers]: Linux containers
Runtime Versions
- aziot-edged [run iotedge version]: 1.5
- Edge Agent [image tag (e.g. 1.0.0)]: 1.5.19
- Edge Hub [image tag (e.g. 1.0.0)]: 1.5.19
- Docker/Moby [run docker version]: 20.10.11+azure-2
Logs
There are already more than 1 million messages in the edgeHub RocksDB store:
edgehub_messages_sent_total{iothub="prodIotHubNtuity.azure-devices.net",edge_device="a17f9e19-1f4a-4168-8f56-2badb1a88646",instance_number="1b3417d6-ded1-4752-a57c-16765ff477d6",from="a17f9e19-1f4a-4168-8f56-2badb1a88646/Modbus",to="a17f9e19-1f4a-4168-8f56-2badb1a88646/ProtocolAbstraction",from_route_output="samples",to_route_input="samples",priority="2000000000",ms_telemetry="True"} 1040373
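To compare routes quickly, the counter can be parsed with a few lines of Python. This is a rough sketch that assumes the metric is scraped in the Prometheus text format (name, labels in braces, then the value on one line). The function name parse_sample and the shortened sample line are made up for illustration.

```python
import re

# metric_name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
# One label pair; the value may contain backslash-escaped characters.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one labeled metric sample into (name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labeled metric sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    # Shortened, hypothetical sample line in the same shape as the one above.
    line = 'edgehub_messages_sent_total{from="dev/Modbus",to="dev/ProtocolAbstraction"} 1040373'
    name, labels, value = parse_sample(line)
    print(name, labels["from"], "->", labels["to"], int(value))
```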
The directory /var/lib/aziot/storage/edgeHub/ contains messages like the attached example, which should already have been deleted because it has expired. According to Copilot, messages are not deleted while RefCount is still > 0.
Additional Information
I think Copilot has already found the root cause; see my investigation.

The problematic code may be the following, because messages are not deleted until RefCount reaches 0:
https://github.com/Azure/iotedge/blob/main/edge-hub/core/src/Microsoft.Azure.Devices.Edge.Hub.Core/storage/MessageStore.cs#L324
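To make the suspected problem concrete, here is a minimal Python model of the two cleanup policies. It is not the actual edgeHub code (which is C#), and every name in it (StoredMessage, cleanup_suspected, cleanup_expected) is hypothetical. It only illustrates the difference between "delete when RefCount is 0" and "delete when RefCount is 0 or the TTL has expired".

```python
from dataclasses import dataclass


@dataclass
class StoredMessage:
    enqueued_at: float  # time the message was stored, in seconds
    ttl: float          # time to live, in seconds
    ref_count: int      # endpoints that still have to receive the message

    def is_expired(self, now):
        return now - self.enqueued_at >= self.ttl


def cleanup_suspected(store, now):
    """Suspected current behavior: a message is removed only at ref_count 0."""
    return {key: msg for key, msg in store.items() if msg.ref_count > 0}


def cleanup_expected(store, now):
    """Expected behavior: expired messages are removed even if still referenced."""
    return {
        key: msg
        for key, msg in store.items()
        if msg.ref_count > 0 and not msg.is_expired(now)
    }


if __name__ == "__main__":
    now = 10_000.0
    store = {
        "expired-undelivered": StoredMessage(enqueued_at=0.0, ttl=7200.0, ref_count=1),
        "fresh-undelivered": StoredMessage(enqueued_at=9_000.0, ttl=7200.0, ref_count=1),
        "delivered": StoredMessage(enqueued_at=9_000.0, ttl=7200.0, ref_count=0),
    }
    print(sorted(cleanup_suspected(store, now)))
    print(sorted(cleanup_expected(store, now)))
```

If the receiving module never comes back, the reference count of its messages never drops, so under the first policy the expired message stays in the store indefinitely. That matches what I see on disk.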