mxj220 (Contributor) commented Feb 5, 2026

What this PR does / why we need it:
Tracks the latest waagent version in components.json via Renovate and installs it in pre-install-dependencies.sh.
Which issue(s) this PR fixes:

Fixes #

@mxj220 mxj220 requested a review from yewmsft as a code owner February 5, 2026 21:42
Copilot AI review requested due to automatic review settings February 5, 2026 21:42
@github-actions github-actions bot added the components This pull request updates cached components on Linux or Windows VHDs label Feb 5, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds automatic tracking and installation of WALinuxAgent (waagent) from GitHub during cluster provisioning. The implementation allows AKS to use specific waagent versions independent of OS package repository versions.

Changes:

  • Added walinuxagent package definition to components.json with version 2.15.0.1
  • Configured Renovate to automatically update walinuxagent from GitHub releases
  • Implemented installation script that downloads, extracts, and installs waagent from GitHub source
  • Integrated waagent installation early in CSE basePrep phase (skipped on FLATCAR)
  • Added waagent.log collection in e2e tests
  • Updated test data (binary blobs) to reflect new CSE behavior
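The download, extract, and install steps listed above can be sketched roughly as follows. This is a minimal sketch, not the PR's actual installWALinuxAgent function: the helper names, the tarball URL layout (GitHub's tag archive path with a `v` prefix on the tag), and the temporary directory handling are all assumptions. Only the `python3 setup.py install --register-service` invocation is taken from the code under review.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and URL layout are assumptions, not AgentBaker code.

# Build the GitHub tag-archive URL for a version (assumes tags look like v2.15.0.1).
waagent_tarball_url() {
    local version="$1"
    echo "https://github.com/Azure/WALinuxAgent/archive/refs/tags/v${version}.tar.gz"
}

# Download, extract, and install the agent from source.
# The temporary directory is removed whether or not the install succeeds.
install_waagent_from_source() {
    local version="$1" workdir rc=0
    workdir=$(mktemp -d) || return 1
    {
        # -f makes curl fail on HTTP errors instead of saving an error page.
        curl -fsSL --retry 5 -o "${workdir}/waagent.tar.gz" "$(waagent_tarball_url "$version")" &&
        tar -xzf "${workdir}/waagent.tar.gz" -C "$workdir" --strip-components=1 &&
        # Subshell keeps the caller's working directory untouched.
        ( cd "$workdir" && python3 setup.py install --register-service )
    } || rc=1
    rm -rf "$workdir"
    return "$rc"
}
```

A caller would pass the version read from components.json, for example `install_waagent_from_source 2.15.0.1 || exit 1`.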

Reviewed changes

Copilot reviewed 36 out of 70 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| parts/common/components.json | Added walinuxagent package definition with GitHub source and version tracking |
| .github/renovate.json | Added custom manager for auto-updating walinuxagent from GitHub releases |
| parts/linux/cloud-init/artifacts/cse_install.sh | Implemented installWALinuxAgent function with download, installation, and service restart logic |
| parts/linux/cloud-init/artifacts/cse_main.sh | Integrated waagent installation in basePrep phase with FLATCAR exclusion |
| e2e/vmss.go | Added waagent.log extraction for e2e debugging |
| pkg/agent/testdata/* | Updated binary test data files (cannot review content) |

mxj220 (Contributor, Author) commented Feb 9, 2026

Just for my understanding: the plan here is to download and update WALinuxAgent during VHD build. However, during nodePrep, if a newer WALinuxAgent is available and some code path runs apt upgrade, is there a chance of it getting upgraded?

@SriHarsha001 Yeah, we want the VHD to start with the latest version at the time of release, and if a newer version is published later on, it can still be pulled via waagent's built-in ability to auto-upgrade when it starts up on the VM.
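Whether the agent picks up newer versions at startup depends on its configuration. As a rough sketch, assuming the agent reads `/etc/waagent.conf` and that the `AutoUpdate.Enabled` key governs this behavior (the helper names are hypothetical, and newer agent releases may consult additional keys), a check could look like:

```shell
# Hypothetical helper: print the value of a key from a waagent.conf-style file.
# Commented-out lines are ignored; if the key appears more than once, the last one wins.
waagent_conf_value() {
    local key="$1" conf="${2:-/etc/waagent.conf}"
    awk -F= -v k="$key" '
        /^[ \t]*#/ { next }
        { name = $1; gsub(/[ \t]/, "", name) }
        name == k { val = $2; sub(/#.*/, "", val); gsub(/[ \t]/, "", val) }
        END { if (val != "") print val }
    ' "$conf"
}

# Succeeds only if the file explicitly sets AutoUpdate.Enabled=y.
# Assumption: a missing key is treated as "not enabled" here, which may
# differ from the agent's own default.
autoupdate_enabled() {
    [ "$(waagent_conf_value AutoUpdate.Enabled "$1")" = "y" ]
}
```

For example, `autoupdate_enabled /etc/waagent.conf && echo "agent may self-update"`.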

"current": {
  "versionsV2": [
    {
      "renovateTag": "github=Azure/WALinuxAgent",
Collaborator

Is Renovate really able to detect and automatically pick up new updates for walinuxagent?
It's also fine if you want to test it after merging this PR.

Contributor Author

It should be able to. I am not an expert on it, though. The immediately important task is installing a modern version that supports FIPS 140-3 out of the box (2.15.0.1). But the renovate functionality will be tested once another version is released.

Copilot AI review requested due to automatic review settings February 9, 2026 23:18
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

"walinuxagent")
    if isFlatcar "$OS"; then
        echo " - walinuxagent installation skipped on Flatcar; using image-provided version" >> ${VHD_LOGS_FILEPATH}
    else
Collaborator

Do we need the if/else if, in the end, we simply print a message in both cases?

Contributor Author

It's not required for any reason other than clarity

if isMarinerOrAzureLinux "$OS"; then
    echo "Removing OS-packaged WALinuxAgent RPM before source install..."
    dnf remove -y WALinuxAgent --noautoremove || true
fi
Collaborator

We don’t uninstall on Linux?

Contributor Author

I was misled while trying to debug an OS Guard issue. dnf/apt remove should not be necessary.

Copilot AI review requested due to automatic review settings February 10, 2026 18:47
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

elif systemctl list-unit-files waagent.service | grep -q waagent; then
    if ! systemctl restart waagent.service; then
        exit 1
    fi
Copilot AI Feb 10, 2026

The service restart logic has a potential issue: if neither walinuxagent.service nor waagent.service is found, the script continues silently without restarting the service. This could leave waagent running with the old version or not running at all. Consider adding an else branch to fail with a clear error message if no recognized service file is found after installation.

Suggested change
-     fi
+     fi
+ else
+     echo "Error: No recognized WALinuxAgent/waagent systemd service unit found after installation; unable to restart agent." >&2
+     exit 1

Copilot uses AI. Check for mistakes.
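One way to make the "no recognized unit" case explicit is to separate unit detection from the restart itself. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the two unit names and their order of preference are inferred from the comment above rather than from the full script.

```shell
# Hypothetical helper: read `systemctl list-unit-files` style output on stdin and
# print the first recognized agent unit; return non-zero if neither is listed.
pick_agent_unit() {
    local listing unit
    listing=$(cat)
    for unit in walinuxagent.service waagent.service; do
        # Compare the first column exactly so partial names do not match.
        if printf '%s\n' "$listing" | awk -v u="$unit" '$1 == u { found = 1 } END { exit !found }'; then
            echo "$unit"
            return 0
        fi
    done
    return 1
}

# Restart whichever unit exists, or fail loudly if none is registered.
restart_agent_or_fail() {
    local unit
    if ! unit=$(systemctl list-unit-files --type=service --no-legend | pick_agent_unit); then
        echo "Error: no recognized WALinuxAgent/waagent systemd unit found; unable to restart agent." >&2
        return 1
    fi
    systemctl restart "$unit"
}
```

Keeping detection as a pure function of the unit listing also makes it possible to exercise the failure path without a running systemd.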
Comment on lines 94 to 97
if ! python3 setup.py install --register-service; then
    popd > /dev/null || exit 1
    exit 1
fi
Copilot AI Feb 10, 2026

The waagent installation uses python3 setup.py install --register-service, which depends on python3 being available and the setup.py script working correctly across different OS distributions. While Ubuntu and Azure Linux both have python3, there could be subtle differences in Python package installation behavior between distributions.

Additionally, the --register-service flag is WALinuxAgent-specific and registers systemd service files. Verify that this works correctly on both Ubuntu and Azure Linux distributions. If either OS uses a different init system or has different systemd configuration, this could fail silently.

Consider adding explicit checks for:

  1. Python3 availability
  2. Setup.py execution success
  3. Service file creation success before attempting restart

Copilot generated this review using guidance from repository custom instructions.
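The explicit checks suggested above might look something like the sketch below. The helper names and the unit-file directories are assumptions chosen for illustration; the actual locations used by `--register-service` may differ per distribution.

```shell
# Hypothetical pre-flight and post-install checks; names and paths are assumptions.

# Fail early with a clear message if a required command is missing.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "Error: required command '$1' not found" >&2
        return 1
    }
}

# Succeeds if one of the expected unit files exists in the given directories.
# With no arguments, a few common systemd unit locations are searched.
agent_unit_file_exists() {
    [ "$#" -gt 0 ] || set -- /etc/systemd/system /usr/lib/systemd/system /lib/systemd/system
    local dir unit
    for dir in "$@"; do
        for unit in walinuxagent.service waagent.service; do
            [ -f "${dir}/${unit}" ] && return 0
        done
    done
    return 1
}

# Example flow (not executed here):
#   require_cmd python3 || exit 1
#   python3 setup.py install --register-service || exit 1
#   agent_unit_file_exists || { echo "Error: no agent unit file registered" >&2; exit 1; }
```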
Comment on lines +921 to +943
testWaagentInstalled() {
    local test="testWaagentInstalled"
    echo "$test:Start"
    # walinuxagent is not installed from source on Flatcar or OS Guard
    if [ "$OS_SKU" = "Flatcar" ] || [ "$OS_SKU" = "AzureLinuxOSGuard" ]; then
        echo "$test: Skipping for ${OS_SKU} as waagent is not installed from source"
        echo "$test:Finish"
        return
    fi
    local expectedVersion="$1"
    # walinuxagent is installed from source during VHD build and the download artifacts are cleaned up,
    # so we verify installation by checking the installed version matches what components.json specifies
    local installedVersion
    installedVersion=$(waagent --version 2>/dev/null | head -n1 | sed -n 's/.*WALinuxAgent-\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p' || true)
    if [ -z "$installedVersion" ]; then
        err $test "walinuxagent is not installed or waagent --version failed"
    elif [ "$installedVersion" != "$expectedVersion" ]; then
        err $test "walinuxagent version mismatch: installed ${installedVersion}, expected ${expectedVersion}"
    else
        echo "$test walinuxagent version ${installedVersion} is installed correctly"
    fi
    echo "$test:Finish"
}
Copilot AI Feb 10, 2026

The waagent test only validates that the version matches, but doesn't verify:

  1. That the waagent service is actually running after installation
  2. That waagent.conf contains the expected AKS log collector configuration
  3. That the service restart succeeded during VHD build

Consider enhancing the test to check:

  • Service status: systemctl is-active walinuxagent.service or systemctl is-active waagent.service
  • Config file contains expected settings: grep -q "Logs.Collect=n" /etc/waagent.conf
  • Waagent can communicate with Azure fabric (basic functionality test)
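The first two suggested checks could be sketched as small helpers alongside the version test. The helper names are hypothetical, and whether the service is expected to be active at VHD test time is an assumption that would need confirming. The fabric communication check is not covered here.

```shell
# Hypothetical additions to the version test; helper names are assumptions.

# Succeeds if either known agent unit reports active.
agent_service_active() {
    local unit
    for unit in walinuxagent.service waagent.service; do
        if systemctl is-active --quiet "$unit"; then
            return 0
        fi
    done
    return 1
}

# Succeeds if the config file contains the exact setting line, e.g. Logs.Collect=n.
# -F treats the setting as a literal string; -x requires a whole-line match.
agent_conf_has_setting() {
    local setting="$1" conf="${2:-/etc/waagent.conf}"
    grep -Fxq "$setting" "$conf"
}
```

Inside the test function these could feed the existing error helper, for example `agent_service_active || err $test "waagent service is not active"`.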

Copilot AI review requested due to automatic review settings February 10, 2026 22:41
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

# walinuxagent is installed from source during VHD build and the download artifacts are cleaned up,
# so we verify installation by checking the installed version matches what components.json specifies
local installedVersion
installedVersion=$(waagent --version 2>/dev/null | head -n1 | sed -n 's/.*WALinuxAgent-\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p' || true)
Copilot AI Feb 10, 2026

The regex pattern for extracting the WALinuxAgent version from waagent --version output is fragile. The pattern s/.*WALinuxAgent-\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p assumes:

  1. The version string always appears after "WALinuxAgent-"
  2. Each version component has at least 2 digits (e.g., "01" instead of "1")

However, actual WALinuxAgent version output may vary:

  • It might be "WALinuxAgent-2.15.0.1" (works)
  • It might be "WALinuxAgent-2.15.0.10" (works because * allows 2+ digits)
  • But pattern won't match if components have only 1 digit like "WALinuxAgent-2.5.0.1"

The pattern should be: s/.*WALinuxAgent-\([0-9]\+\.[0-9]\+\.[0-9]\+\(\.[0-9]\+\)*\).*/\1/p to match 1 or more digits per component, not 2+ digits.

Suggested change
- installedVersion=$(waagent --version 2>/dev/null | head -n1 | sed -n 's/.*WALinuxAgent-\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p' || true)
+ installedVersion=$(waagent --version 2>/dev/null | head -n1 | sed -n 's/.*WALinuxAgent-\([0-9]\+\.[0-9]\+\.[0-9]\+\(\.[0-9]\+\)*\).*/\1/p' || true)
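For reference, in POSIX basic regular expressions `[0-9][0-9]*` means one digit followed by zero or more digits, i.e. one or more digits, so the original pattern should already accept single-digit components. `\+` is a GNU sed extension with the same meaning here. A small sketch to check this, using a hypothetical helper name and made-up sample strings rather than captured `waagent --version` output:

```shell
# Hypothetical helper wrapping the original sed expression from the test under review.
extract_waagent_version() {
    head -n1 | sed -n 's/.*WALinuxAgent-\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p'
}

# Sample inputs are illustrative only; each line prints "input -> extracted version".
for sample in "WALinuxAgent-2.15.0.1 running on ubuntu 22.04" \
              "WALinuxAgent-2.5.0.1 running on ubuntu 22.04" \
              "WALinuxAgent-2.2.45 running on ubuntu 18.04"; do
    printf '%s -> %s\n' "$sample" "$(printf '%s\n' "$sample" | extract_waagent_version)"
done
```

If the single-digit sample extracts cleanly, the two patterns are interchangeable under GNU sed, and the original form has the advantage of not relying on an extension.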

6 participants