[Agent] Manual rollback for agent upgrades #4918
karenzone wants to merge 5 commits into elastic:main
✅ Vale Linting Results: No issues found on modified lines! The Vale linter checks documentation changes against the Elastic Docs style guide. To use Vale locally or report issues, refer to Elastic style guide for Vale.
@karenzone, notes to self:
Earlier {{agent}} versions could detect issues and automatically roll back to the previous installed version within ten minutes of an upgrade if needed.
This feature is still available and on by default.

::::{admonition} Elastic subscription
- Is the info about automatic rollbacks relevant, or too much?
- Does the automatic rollback feature require a different subscription level? I want to be sure that I get the admonition placed properly.
This "feature" is news to me, unfortunately :-(
I know that if the download fails, the upgrade fails and won't proceed, but that is not a rollback.
@cmacknz could you help me with this? Is this just in case of an upgrade failure?
Not @cmacknz but I will try to clarify:
Elastic Agent could already automatically roll back if serious problems are detected after starting the new version (for example, the elastic-agent process crashing repeatedly or remaining in an unhealthy state).
The manual rollback feature is useful for more subtle issues that are not caught by the automatic process immediately after the upgrade.
@nimarezainia, @pchila, @cmacknz, how do we want to resolve this?
More info is not always better, especially if it could introduce confusion for users. If we decide the automatic rollback is too much, I can delete it.
@karenzone let's not include this automatic rollback in the section on the manual rollback, which is triggered by user action. If we want to mention the automatic rollback, I think it should go in the upgrade section of the docs. WDYT?
cc: @pchila @jillguyonnet @nimarezainia, your early comments are welcome.

@karenzone the content looks good to me. Could the "Roll back an Elastic Agent upgrade for Fleet Managed agents" section have a larger font size so it's at the same hierarchical level as "Update RPM and DEB system packages"? Currently it feels as though it's part of the RPM and DEB section.
pchila left a comment
Left a couple of suggestions (minor rewording that makes the content clearer, I hope).
Left a question about the Fleet API endpoint to use to roll back a single agent, since it doesn't look right to me.
Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>
Force-pushed from 86792c2 to acf74e8
jillguyonnet left a comment
LGTM overall, left a couple of small comments.
:Call `POST /api/fleet/agents/{agentID}/rollback`.

To roll back multiple agents:
:Call `POST /api/fleet/agents/bulk_rollback`.
Do we already have links to the API reference? Probably not a huge deal if not in this case, since there are no options, but bulk endpoints expect agents in the request body (either a list of agent IDs or a query string; see, for example, bulk migrate), so making that explicit could be helpful.
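To make the request-body point concrete, here is a minimal Python sketch of how the two calls might be built. The endpoint paths are copied from the diff under review and have not been confirmed against the API reference (one reviewer questions the single-agent path). The `{"agents": [...]}` body shape is an assumption based on the comment above, the `kbn-xsrf` header is the usual requirement for Kibana API calls, and `build_rollback_request` is a hypothetical helper name. The function only builds the request; it does not send it.

```python
import json
import urllib.request


def build_rollback_request(kibana_url, api_key, agent_ids):
    """Build (but do not send) a Fleet rollback request.

    agent_ids: a single agent ID string, or a list of agent IDs for a bulk call.
    Paths are taken from the PR diff; the bulk body shape is an assumption.
    """
    headers = {
        "Authorization": f"ApiKey {api_key}",
        "kbn-xsrf": "true",  # Kibana rejects API writes without this header
        "Content-Type": "application/json",
    }
    base = kibana_url.rstrip("/")
    if isinstance(agent_ids, str):
        # Single agent: the ID goes in the path and no body is sent.
        url = f"{base}/api/fleet/agents/{agent_ids}/rollback"
        body = None
    else:
        # Bulk: agents are listed in the request body (assumed shape).
        url = f"{base}/api/fleet/agents/bulk_rollback"
        body = json.dumps({"agents": list(agent_ids)}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the path and body easy to check before anything touches a live Fleet server.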
The manual rollback feature for {{agent}} gives you the ability to roll back to the previously installed version if the install artifacts are still available on disk, typically seven days after the upgrade.

To roll back one or more {{agent}} upgrades:
FYI: in the case of a single agent, the rollback action menu item is only enabled if the agent has a valid, non-expired rollback; otherwise it is disabled. This could be helpful to know, as a disabled menu item indicates that a rollback is not available for this agent. In contrast, bulk actions are always enabled and will report errors for agents that do not have a valid rollback.
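The difference between the single-agent and bulk behavior can be sketched as follows. This is an illustrative model only, not Fleet's implementation: `AgentRecord`, `rollback_available`, and `bulk_rollback` are hypothetical names, and the fixed seven-day window is an assumption standing in for "typically seven days" (the real check depends on whether the install artifacts are still on disk).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: a fixed window. The docs text says "typically seven days".
ROLLBACK_WINDOW = timedelta(days=7)


@dataclass
class AgentRecord:
    """Hypothetical record; not a Fleet data structure."""
    agent_id: str
    upgraded_at: Optional[datetime]  # None if the agent was never upgraded


def rollback_available(agent, now):
    """True if the agent was upgraded and the rollback has not expired."""
    if agent.upgraded_at is None:
        return False
    return timedelta(0) <= now - agent.upgraded_at < ROLLBACK_WINDOW


def bulk_rollback(agents, now):
    """Attempt every agent and report an error for each one that cannot roll back."""
    accepted, errors = [], {}
    for agent in agents:
        if rollback_available(agent, now):
            accepted.append(agent.agent_id)
        else:
            errors[agent.agent_id] = "no valid rollback available"
    return accepted, errors
```

In this model, the single-agent menu item would be enabled exactly when `rollback_available` returns true, while the bulk action always runs and surfaces ineligible agents through the `errors` mapping.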
Summary
Adds docs for Manual rollback for agent upgrades.
If a user upgrades an agent running v9.3.0 or later, they have the option to roll back to the previously installed version for seven days.
Fixes: elastic/elastic-agent#11536
Generative AI disclosure
- Yes (content refinement)
- No
Copilot (Claude Sonnet 4.5)