
Track deleted runtime objects to avoid erroneous recreation #10830

Open
jschmidt-icinga wants to merge 4 commits into master from remember-deleted-api-objects

Conversation

Contributor

@jschmidt-icinga jschmidt-icinga commented Apr 29, 2026

Description

This fixes deleted objects being recreated, and the cluster state becoming inconsistent, after objects are deleted via the API on one endpoint while the second endpoint is offline or unreachable.

This is done by keeping track of deleted objects in a map from the deleted object's name to the object's "version". If an object would be recreated with a version that is older than or equal to the version it was deleted with, the creation is skipped. The expectation is that, at the same time, the deletion event is replayed to the other endpoint that is trying to sync the object creation.
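To illustrate the mechanism, here is a minimal standalone sketch of the bookkeeping. The class and method names are made up for this example; the PR itself uses a GetDeletedRuntimeObjects() dictionary keyed as "Type:name", as shown in the quoted diff below.

#include <map>
#include <string>

// Standalone sketch of the deletion-tracking idea; illustrative only,
// not the actual Icinga 2 classes touched by this PR.
class DeletedObjectTracker
{
public:
    // Remember that "Type:name" was deleted at the given version (timestamp).
    void RecordDeletion(const std::string& key, double version)
    {
        m_Deleted[key] = version;
    }

    // A synced creation is skipped if a deletion with an equal or newer
    // version has been recorded for the same object.
    bool ShouldSkipCreation(const std::string& key, double version) const
    {
        auto it = m_Deleted.find(key);
        return it != m_Deleted.end() && version <= it->second;
    }

private:
    std::map<std::string, double> m_Deleted; // "Type:name" -> deletion version
};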

Testing

For now I have built an integration test in my own dev-docker environment based on the steps described in #10022. Additionally, the test verifies that creating a new object with the same name afterwards works fine. I've not yet tested the various split-brain scenarios @julianbrost and I discussed offline.

Fixes #10022.

@cla-bot cla-bot Bot added the cla/signed label Apr 29, 2026
@jschmidt-icinga jschmidt-icinga added the bug Something isn't working label Apr 29, 2026
@jschmidt-icinga jschmidt-icinga changed the title from "(WIP) Track deleted runtime objects to avaóid erroneous recreation" to "(WIP) Track deleted runtime objects to avoid erroneous recreation" Apr 30, 2026
@jschmidt-icinga jschmidt-icinga force-pushed the remember-deleted-api-objects branch from 94de327 to 34ce649 on May 4, 2026 12:54
Comment thread lib/icinga/icingaapplication.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp Outdated
@jschmidt-icinga jschmidt-icinga force-pushed the remember-deleted-api-objects branch from 34ce649 to 7f60c60 on May 5, 2026 14:40
@jschmidt-icinga jschmidt-icinga changed the title from "(WIP) Track deleted runtime objects to avoid erroneous recreation" to "Track deleted runtime objects to avoid erroneous recreation" May 6, 2026
@jschmidt-icinga jschmidt-icinga marked this pull request as ready for review May 6, 2026 09:57
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment on lines +461 to +462
auto deletionTime = Utility::GetTime();
GetDeletedRuntimeObjects()->Set(object->GetReflectionType()->GetName() + ":" + object->GetName(), deletionTime);
Member

Wouldn't this happen on each node processing the deletion individually, i.e. each one would get a different value here? If that happens, different nodes could handle (accept/discard) the same subsequent change to the same object differently. In particular, if the deletion is replayed, there could be significant discrepancies in the values.

So in my expectation, the node initially receiving the deletion request from the HTTP API should get the deletion time/version and then replicate it to the others. One option (though not sure if that's the best way, it's just my first idea) would be to only Set() here if it's not already set, and already insert the value from the JSON-RPC message in ConfigDeleteObjectAPIHandler.
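A minimal sketch of that idea, again with standalone, made-up names rather than the PR code: store the version only if none exists yet, so a version carried in the JSON-RPC message is not overwritten by the local clock during replay.

#include <map>
#include <string>

// Record a deletion version only if no entry exists for the key yet.
// std::map::emplace() is a no-op when the key is already present.
void RecordDeletionIfAbsent(std::map<std::string, double>& deleted,
    const std::string& key, double version)
{
    deleted.emplace(key, version);
}

The HTTP handler would call this with the local time, while the JSON-RPC handler would pass the version taken from the incoming message, as suggested above.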

Contributor Author

@jschmidt-icinga jschmidt-icinga May 7, 2026

I think that solution would have its own downsides. For example, if we only add the entry in ApiListener::DeleteConfigObject when it doesn't already exist, then future HTTP API deletes for an identically named object will no longer update the timestamp.
I'll think about what to do with this.

Contributor Author

Maybe we can use the cookie that is already passed through ConfigObjectUtility back into ApiListener to only update the timestamp if the request does not originate from a JSON-RPC message. It isn't pretty, but since everything happens in the same file, it should be fine if I add a comment that explains why.

Member

While thinking about this again, I noticed that only PruneDeletedRuntimeObjects() removes entries from that map. Also doing that when recreating an object might be a good idea in itself, but it could also be used to solve this.

Contributor Author

@jschmidt-icinga jschmidt-icinga May 7, 2026

I had that in the first iteration, but removed it because I was worried about the countless split-brain scenarios potentially leading to inconsistencies. Can we be sure that there will never be any states between peers, or even race conditions between JSON-RPC and HTTP, where this could fail?

Like a scenario where both masters receive multiple creation and deletion API requests in (relatively) quick succession, which they then forward to each other.

I'll have to think about whether I can come up with a concrete scenario. But I'm pretty sure that if we prioritize non-RPC deletions with higher timestamps, that should always be safe.

Contributor Author

@jschmidt-icinga jschmidt-icinga May 7, 2026

There is another problem that I think neither of our approaches solves: JSON-RPC unconditionally deletes objects with cascade = true (which is a whole new source of inconsistency on its own, in case the HTTP API originally deleted with cascade = false). If we guard insertion with a check on HTTP vs. JSON-RPC, no timestamps will be added for these child objects, and if we follow your suggestion the timestamps will be updated to the replay time while the parent stays on the deletion timestamp.

Member

Can we be sure that there will never be any states between peers, or even race conditions between JSON-RPC and HTTP, where this could fail?

Not without looking into this closely. But that could indeed be a problem.

There is another problem that I think neither of our approaches solves: JSON-RPC unconditionally deletes objects with cascade = true (which is a whole new source of inconsistency on its own, in case the HTTP API originally deleted with cascade = false).

That doesn't immediately sound like that much of a problem. Isn't it a prerequisite for the config to already be out of sync for this to make a difference? If deletion with cascade=false succeeded on one node, this implies there are no objects that depend on the deleted object. If deleting with cascade=true on the second node now deletes more, it should only delete objects that didn't even exist on the first one.

Contributor Author

@jschmidt-icinga jschmidt-icinga May 8, 2026

Isn't it a prerequisite for the config to already be out of sync for this to make a difference?

One example (true or false doesn't even really matter):

  • On ep1, an object obj1 is successfully deleted (either with no children or with cascade = true)
  • Immediately after the deletion, obj1 is recreated, along with a child object obj2
  • ep2 also creates obj2 as a child of the old obj1, which hasn't been deleted there yet
  • The deletion of obj1 is received by ep2, which deletes obj1 and obj2 (but with the current timestamp)
  • ep2 receives the creation of the new obj1 and executes it
  • ep2 receives the creation of the new obj2 and ignores it, because the cascading delete set the deletion timestamp to a time later than obj2's creation

Not sure how realistic that is, but deleting and recreating objects is pretty common, and obj2's creation could be something that triggers independently on both endpoints (I'm thinking about something like downtimes, but not sure if that would actually automatically trigger on both eps). It could probably be forced via the HTTP API.

PS: To solve this, one option would be to somehow pass the timestamp through ConfigObjectUtility as part of the cookie, which gets propagated to child deletions.
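A purely hypothetical sketch of that last idea (the actual cookie handling in ConfigObjectUtility may look quite different): serialize the original deletion time into the cookie string and recover it for each cascaded child deletion, so the children record the parent's timestamp rather than the replay time.

#include <string>

// Hypothetical helpers: encode/decode a deletion timestamp in the cookie.
std::string MakeDeletionCookie(const std::string& cookie, double deletionTime)
{
    return cookie + ";deleted-at=" + std::to_string(deletionTime);
}

bool TryGetDeletionTime(const std::string& cookie, double& deletionTime)
{
    const std::string marker = ";deleted-at=";
    auto pos = cookie.find(marker);
    if (pos == std::string::npos)
        return false;
    deletionTime = std::stod(cookie.substr(pos + marker.size()));
    return true;
}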

Member

  • ep2 also creates obj2 as a child of the old obj1 that hasn't been deleted yet

Isn't that the point where the configuration goes out of sync already?

I think the unfortunate reality is that Icinga 2 configuration just isn't a distributed transactional database and fixing all possible edge cases is currently out of scope.

I'm thinking about something like downtimes, but not sure if that would actually automatically trigger on both eps

Normally it shouldn't. One endpoint should handle the downtime and then sync it to the other one.

Contributor Author

Isn't that the point where the configuration goes out of sync already?

Ideally it wouldn't. ep2 would sync obj2 to ep1 and things stay consistent.

and fixing all possible edge cases is currently out of scope.

It probably doesn't matter for the concrete issue we're trying to solve in this PR, which is why I didn't attempt a fix in my last push. We can table it for now, but I think this is realistic enough that it warrants at least writing it down as a separate issue so we don't forget about it. I'll do that once this PR is merged.

Comment thread lib/remote/apilistener.cpp
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp Outdated
@jschmidt-icinga jschmidt-icinga force-pushed the remember-deleted-api-objects branch from 7f60c60 to eb74fc6 on May 8, 2026 06:50
Comment thread lib/base/dictionary.cpp
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp Outdated
Comment thread lib/remote/apilistener-configsync.cpp
Comment thread lib/remote/apilistener-configsync.cpp
Comment thread lib/remote/apilistener-configsync.cpp
Comment thread lib/remote/apilistener-configsync.cpp Outdated

@jschmidt-icinga jschmidt-icinga force-pushed the remember-deleted-api-objects branch from eb74fc6 to 8fe26b3 on May 8, 2026 12:15
@jschmidt-icinga jschmidt-icinga force-pushed the remember-deleted-api-objects branch from 8fe26b3 to bfafef7 on May 8, 2026 12:20
@jschmidt-icinga jschmidt-icinga force-pushed the remember-deleted-api-objects branch from bfafef7 to 1d54945 on May 11, 2026 12:01

Labels

bug (Something isn't working), cla/signed

Development

Successfully merging this pull request may close these issues.

API object deletion unreliable if HA zone is not connected

2 participants