Skip to content

fix(vm_collect,vm_hot_plug): fix label selector and memory bugs#8

Merged
sabre1041 merged 7 commits into
redhat-cop:mainfrom
stevefulme1:fix/vm-collect-and-hotplug-bugs
May 5, 2026
Merged

fix(vm_collect,vm_hot_plug): fix label selector and memory bugs#8
sabre1041 merged 7 commits into
redhat-cop:mainfrom
stevefulme1:fix/vm-collect-and-hotplug-bugs

Conversation

@stevefulme1

Copy link
Copy Markdown
Contributor

Summary

  • vm_collect role: Fix two copy-paste bugs from vm_backup_restore that break VM collection when using label_selectors
  • vm_hot_plug role: Fix wrong condition in compute patch template that prevents memory-only hotplug

Bug Details

Bug 1 — Wrong variable names in task name (vm_collect/tasks/main.yml:48)

The label selector block's task name referenced vm_backup_restore_collect_obj and collect_obj_default_kind — leftover from the vm_backup_restore role. Both are undefined in the vm_collect context, causing a variable error when the block evaluates.

Fix: Changed to vm_collect_obj | default(vm_collect_obj_default_kind) to match the rest of the role.

Bug 2 — Wrong loop variable in set_fact (vm_collect/tasks/main.yml:64)

The set_fact in the label_selectors block referenced vm_collect_no_label_selectors_response (the loop_var from the other block) instead of vm_collect_label_selectors_response. This means VMs queried via label selectors were never added to the collection list — the variable was undefined, so the task would error or produce empty results.

Fix: Changed vm_collect_no_label_selectors_responsevm_collect_label_selectors_response.

Bug 3 — Wrong condition for memory hotplug (vm_hot_plug/templates/compute_patch.yml.j2:13)

The Jinja2 template checked 'cpu' in vm_hot_plug_compute before rendering the memory patch (/spec/template/spec/domain/memory/guest). This caused:

  • Memory-only hotplug silently skipped (condition false)
  • CPU-only hotplug to fail (tries to render vm_hot_plug_compute['memory'] which is undefined)

Fix: Changed condition to 'memory' in vm_hot_plug_compute.

Test plan

  • Run vm_hot_plug with label_selectors in the request — VMs should now be collected and patched
  • Run vm_hot_plug with CPU-only compute change — should patch CPU without attempting memory
  • Run vm_hot_plug with memory-only compute change — should patch memory correctly
  • Run vm_hot_plug with both CPU and memory — both patches applied
  • Run vm_hot_plug with namespace/names (no label_selectors) — verify no regression

🤖 Generated with Claude Code

@stevefulme1

Copy link
Copy Markdown
Contributor Author

Observed Scenario

A user reported CPU hotplug failing via AAP with the following job template vars:

vars:
  openshift_host: "{{ openshift_host }}"
  openshift_api_key: "{{ openshift_api_key }}"
  openshift_verify_ssl: "{{ openshift_verify_ssl | default(false) | bool }}"
  vm_hot_plug_request:
    - namespace: "{{ vm_namespace }}"
      names:
        - "{{ vm_name }}"
      compute:
        cpu:
          cores: "{{ cpu_cores | int }}"
      restartIfRequired: "{{ restart_if_needed | default(false) | bool }}"

Observed behavior:

  • "Initialize Variables" runs fine (vm_hot_plug_vms: [])
  • "Invoke Collect VM Role" skips — "skipped_reason": "No items in the list"
  • "Process Hot Plug VM" skips — "skipped_reason": "No items in the list"

Analysis:

This scenario uses namespace + names (no label_selectors), so it takes the "without label selector" path in vm_collect. The three bugs fixed in this PR are real and would compound the problem:

  1. Bug add ci workflow #1 (task name) — If Ansible strict mode or ANSIBLE_ACTION_WARNINGS is enabled, the undefined vm_backup_restore_collect_obj reference in the label selector block's task name could cause failures even when that block is skipped during template evaluation.
  2. Bug fix: trigger first release with ci workflow #2 (wrong loop var) — Would break any request that does use label_selectors, since collected VMs would never be added to the list.
  3. Bug chore: fix lint issues, add gitleaks pre-commit hook #3 (memory condition) — Would hit this user the moment they attempt memory-only hotplug or would cause a failure if they set CPU without memory (template tries to render undefined memory value).

For the specific "No items in the list" skip on vm_hot_plug_request, the root cause may also involve AAP variable precedence — the vars need to reach the vm_hot_plug role scope. If vm_hot_plug_request is defined at the playbook include_role vars level but the survey/extra vars substitution ({{ vm_namespace }}, {{ vm_name }}, etc.) isn't resolving, the list could evaluate to empty or the role could fall back to defaults/main.yml which defaults to [].

@stevefulme1 stevefulme1 changed the title fix(vm_collect, vm_hot_plug): fix copy-paste bugs breaking label selectors and memory hotplug fix(vm_collect, vm_hot_plug): fix label selector and memory bugs Apr 15, 2026
@sabre1041 sabre1041 changed the title fix(vm_collect, vm_hot_plug): fix label selector and memory bugs fix: (vm_collect, vm_hot_plug): fix label selector and memory bugs [skip ci] May 1, 2026
@stevefulme1 stevefulme1 changed the title fix: (vm_collect, vm_hot_plug): fix label selector and memory bugs [skip ci] fix(vm_collect,vm_hot_plug): fix label selector and memory bugs May 2, 2026
sabre1041
sabre1041 previously approved these changes May 5, 2026

@sabre1041 sabre1041 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sabre1041 sabre1041 self-requested a review May 5, 2026 03:31
@sabre1041

Copy link
Copy Markdown

LGTM

Omitted seeing conflicted file

@stevefulme1 would you be able to review the conflict in the test-requirements.txt file?

stevefulme1 and others added 7 commits May 5, 2026 15:31
…ctor queries and memory hotplug

Three bugs fixed:

1. vm_collect: task name referenced vm_backup_restore_collect_obj and
   collect_obj_default_kind (leftover from backup_restore role), causing
   undefined variable errors when using label_selectors. Fixed to use
   vm_collect_obj and vm_collect_obj_default_kind.

2. vm_collect: set_fact in the label_selectors block referenced
   vm_collect_no_label_selectors_response (the loop_var from the other
   block) instead of vm_collect_label_selectors_response. This caused
   VMs queried via label_selectors to never be added to the collection,
   resulting in an empty VM list downstream.

3. vm_hot_plug: compute_patch.yml.j2 checked for 'cpu' instead of
   'memory' when deciding whether to patch domain/memory/guest. Memory-
   only hotplug was silently skipped; cpu-only hotplug would fail trying
   to render an undefined memory value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@stevefulme1

Copy link
Copy Markdown
Contributor Author

Rebased onto main and resolved the test-requirements.txt conflict. The upstream already had the tox-ansible<26.2.2 pin so the conflict was just a redundant tox line from the earlier iterative fixes. Also resolved a tox-ansible.ini conflict where upstream had added sh and ade to allowlist_externals.

@sabre1041 sabre1041 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sabre1041 sabre1041 merged commit 8489e0b into redhat-cop:main May 5, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants