fix: E2E bug fixes — PS 7.3+ jq quoting, deploy parity, recipe fixes #171

Merged
vyomnagrani merged 17 commits into microsoft:main from dm-chelupati:fix/e2e-bugs
May 15, 2026
@dm-chelupati
Collaborator

Summary

Comprehensive fixes discovered during E2E testing of all recipes across bash, PowerShell, azd, bicep, and terraform backends.

Key Changes

PowerShell 7.3+ jq Compatibility (largest change)

  • New: Invoke-Jq.ps1 — shared wrapper that writes jq filters to temp files and uses jq -f, bypassing all PS 7.3+ native argument mangling
  • Converted all PS scripts (New-Agent, Assemble-Agent, Verify-Agent, Diff-Agent, Export-Agent, Deploy-Tf) to use Invoke-Jq for complex filters
  • Replaced --argjson with --slurpfile pattern for safe JSON variable passing
  • Fixed terraform -out=tf.plan arg splitting in Deploy-Tf.ps1
  • Added SubscriptionId alias to Apply-Extras.ps1 -Subscription parameter
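
The filter-file approach rests on two standard jq options: `-f` reads the filter from a file, and `--slurpfile` binds a variable to the JSON values in a file. The shell sketch below only illustrates those two options. It is not the contents of Invoke-Jq.ps1 (which is PowerShell), and the filter, variable name `extra`, and sample JSON are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: keep the jq filter and JSON variables in files so that no shell
# ever has to pass them as quoted command-line arguments.
if command -v jq >/dev/null 2>&1; then
  filter_file=$(mktemp)
  vars_file=$(mktemp)

  # A filter containing commas and // (the characters the PR says get mangled).
  cat > "$filter_file" <<'EOF'
{ name: (.name // "unnamed"), tags: ($extra[0].tags + [.kind]) }
EOF

  # JSON that would otherwise be passed inline with --argjson.
  echo '{"tags":["e2e"]}' > "$vars_file"

  # --slurpfile binds $extra to an array of the values found in the file,
  # which is why the filter reads $extra[0].
  result=$(echo '{"kind":"minimal"}' \
    | jq -c --slurpfile extra "$vars_file" -f "$filter_file")
  echo "$result"

  rm -f "$filter_file" "$vars_file"
else
  echo "jq not found; skipping demo" >&2
fi
```

Because only file paths cross the process boundary, the shell's argument-passing rules never touch the filter text or the JSON value.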

Deploy & Clone Fixes

  • clone-agent.sh: added --backend terraform flag
  • deploy.sh: fixed unbound array errors on bash 3.2 (macOS)
  • apply-extras.sh: normalize short org/repo URLs to full GitHub URLs
  • deploy.sh: added a connector health check after deploy (warns when a connector's provisioningState is not healthy)
  • Copy extras file next to clone params so deploy.sh finds it

Recipe Fixes

  • Set dynatrace-mcp recipe accessLevel to Low
  • Renamed bare recipe → minimal
  • Fixed azd provider and env handling; added a no-op infra/main.bicep stub

Testing

  • All 4 recipes tested: minimal, azmon-lawappinsights, dynatrace-mcp, pagerduty-law-vmcosmos
  • All pass: New-Agent, Deploy-Agent -DryRun, Deploy-Agent -WhatIf, Deploy-Tf -DryRun
  • Tested on PS 7.5.1/macOS and bash 3.2/5.2

Files Changed (8 PS files + multiple bash scripts)

  • bin/ps/Invoke-Jq.ps1 (new)
  • bin/ps/New-Agent.ps1, Export-Agent.ps1, Verify-Agent.ps1, Diff-Agent.ps1, Deploy-Tf.ps1
  • bicep/Assemble-Agent.ps1, bicep/Apply-Extras.ps1
  • bin/deploy.sh, bin/clone-agent.sh, bicep/apply-extras.sh
  • Recipe configs and READMEs

…xport-Agent null checks, PD connector count

Bug fixes:
- clone-agent.sh: Use ${OVERRIDES[@]+...} pattern for bash 3.2 compat with set -u
- All PS scripts: Add PSNativeCommandArgumentPassing=Legacy for PS 7.3+
  (fixes Deploy-Tf.ps1 'Too many command line arguments' from terraform)
- Export-Agent.ps1: Add null fallback checks for skillNames, saNames, hookNames, promptNames
  (fixes 'jq: invalid JSON text passed to --argjson' during expected-config generation)
- PD recipe expected-config.json: Add 3 knowledge-source connectors
  (actual=4 vs expected=1 — knowledge files are counted as connectors)
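
For context on the bash 3.2 fix: with `set -u`, expanding an empty array as `"${ARR[@]}"` aborts on bash 3.2 with an "unbound variable" error. The commit message elides the exact expansion used, so the sketch below shows the commonly used form of the idiom as an assumption; `count_args` is a hypothetical helper added only to make the effect visible.

```shell
#!/usr/bin/env bash
set -u  # unset variables are errors, as in the scripts being fixed

# An array of optional overrides that may legitimately be empty.
OVERRIDES=()

# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

# ${OVERRIDES[@]+word} expands to nothing when the array has no elements,
# and to the individually quoted elements otherwise.
empty_count=$(count_args ${OVERRIDES[@]+"${OVERRIDES[@]}"})

OVERRIDES=("--name" "my agent")
full_count=$(count_args ${OVERRIDES[@]+"${OVERRIDES[@]}"})

echo "empty: $empty_count, populated: $full_count"
```

The inner quotes keep elements containing spaces intact, so `"my agent"` still arrives as a single argument.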

Test scripts:
- azd test scripts: Copy config to ./agents/<name>/ before azd up
  (fixes 'No config and RECIPE not set' in preprovision hook)
- Add all 15 e2e test scripts + run-all-e2e.sh + results report
- azure.yaml: add infra.provider=custom so azd skips looking for infra/main.bicep
- azd test scripts: use 'azd env select || azd env new' to handle pre-existing envs
…scripts

- azure.yaml: replace unsupported 'provider: custom' with no-op bicep stub (works all azd versions)
- infra/main.bicep: no-op bicep file for azd compatibility
- bin/deploy.sh: fix CLEANUP_FILES[@] unbound variable on bash 3.2 (macOS)
- recipes/azmon: set expected repos to [] (skip GitHub for E2E)
- bin/install-prerequisites.sh: auto-install prereqs for macOS/Linux
- bin/ps/Install-Prerequisites.ps1: auto-install prereqs for Windows
- README.md: add install-prerequisites as first step
- run-all-e2e.sh: add SKIP_PS=1 filter
- test scripts: remove githubRepo (test GitHub separately)
- test scripts: increment agent/RG names to '2' suffix

Main README:
- Move 'What Gets Deployed' table to shared section (was Terraform-only)
- Remove 'swedencentral for fastest provisioning' comment
- Add bare recipe to recipes table

All recipe READMEs:
- Add Prerequisites section with install-prerequisites.sh --check
- Clarify what happens when optional params are blank (connector disabled)
- Distinguish appInsightsId (ARM resource ID) vs appInsightsAppId (GUID)
- Standardize Advanced Options table with defaults column
- Convert 'What You Get' from bullet list to table
- Add 'After Deploy' section with portal verification steps
- Rename 'Clone' to 'Clone an Existing Agent' with better instructions
- Fix azd link to point to #azure-developer-cli-azd anchor

New bare recipe:
- Minimal agent with just infra + RBAC + safety-rules prompt
- No connectors, skills, or automations
- README explains how to add capabilities post-deploy
- Points users to other recipes as examples

The repos dataplane API now validates URL format and rejects short
'org/repo' values. Prepend https://github.com/ when the URL doesn't
start with http and contains a slash.
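
The rule described above can be sketched as a small shell function. The name `normalize_repo_url` is hypothetical; the actual code in apply-extras.sh may be structured differently.

```shell
#!/usr/bin/env bash
# Sketch: expand short org/repo values to full GitHub URLs.
normalize_repo_url() {
  local url="$1"
  case "$url" in
    http*) printf '%s\n' "$url" ;;                     # already a full URL
    */*)   printf '%s\n' "https://github.com/$url" ;;  # short org/repo form
    *)     printf '%s\n' "$url" ;;                     # no slash: leave as-is
  esac
}

normalize_repo_url "microsoft/example"
normalize_repo_url "https://github.com/microsoft/example"
```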

When clone-agent.sh calls deploy.sh with a temp .parameters.json,
deploy.sh couldn't find the matching .extras.json for its summary
and auto-apply. Now we copy the extras file alongside the params.
Also prevents duplicate apply-extras runs.
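
A minimal sketch of the copy step, assuming the two files share a base name and differ only in the `.parameters.json` / `.extras.json` suffix. The function name and all paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: place the extras file next to a temp params file so a later
# lookup by base name succeeds.
copy_extras_alongside() {
  local src_params="$1" tmp_params="$2"
  local src_extras="${src_params%.parameters.json}.extras.json"
  local dst_extras="${tmp_params%.parameters.json}.extras.json"
  [ -f "$src_extras" ] || return 0   # nothing to copy
  cp "$src_extras" "$dst_extras"
}

# Demo with throwaway files.
work=$(mktemp -d)
echo '{}' > "$work/agent.parameters.json"
echo '{"repos":[]}' > "$work/agent.extras.json"
mkdir "$work/tmp"
cp "$work/agent.parameters.json" "$work/tmp/clone.parameters.json"

copy_extras_alongside "$work/agent.parameters.json" "$work/tmp/clone.parameters.json"
ls "$work/tmp"
```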

The URL normalization fix was only in the first repo loop. There's a
second repo PUT loop in the OAuth-wait-and-retry path that also
needs the org/repo → full URL fix.

Allows cloning with Terraform instead of Bicep:
  clone-agent.sh --from-agent X --backend terraform ...

This calls deploy-tf.sh with the exported config directory so users
get a terraform.tfstate they can version control.
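
The routing implied by the new flag can be sketched as follows. The function name is hypothetical, and treating bicep as the default backend is an assumption based on the wording "instead of Bicep".

```shell
#!/usr/bin/env bash
# Sketch: pick the deploy script from a --backend flag.
choose_deploy_script() {
  local backend="bicep"   # assumed default
  while [ $# -gt 0 ]; do
    case "$1" in
      --backend)
        [ $# -ge 2 ] || { echo "--backend needs a value" >&2; return 2; }
        backend="$2"; shift 2 ;;
      *) shift ;;           # ignore arguments this sketch does not model
    esac
  done
  case "$backend" in
    bicep)     echo "deploy.sh" ;;
    terraform) echo "deploy-tf.sh" ;;
    *) echo "unknown backend: $backend" >&2; return 2 ;;
  esac
}

choose_deploy_script --from-agent X --backend terraform
```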

az ... | tee can fail with SIGTERM/BrokenPipeError, causing deploy.sh
to report failure even when the ARM deployment succeeded. Write to a
tmpfile directly, then cat it. Fall back to querying ARM if the output
is corrupt.
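
A sketch of the redirect-then-cat pattern. With a plain redirect there is no pipe, so the reported exit status comes only from the command itself. `fake_deploy` is a stand-in for the real `az` call, and the ARM fallback is not shown.

```shell
#!/usr/bin/env bash
# Sketch: capture a command's stdout in a file without a pipeline,
# then print the file and return the command's own exit status.
run_and_capture() {
  local outfile="$1"; shift
  local status=0
  "$@" > "$outfile" || status=$?
  cat "$outfile"
  return "$status"
}

# Stand-in for the deployment command (hypothetical).
fake_deploy() { echo '{"provisioningState":"Succeeded"}'; }

out_file=$(mktemp)
deploy_status=0
run_and_capture "$out_file" fake_deploy || deploy_status=$?
echo "exit status: $deploy_status"
```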

ARM reports Succeeded even when a connector fails to authenticate
(e.g. missing DT_TOKEN). Now deploy.sh queries each connector's
provisioningState and warns if any are not healthy.
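
The warning step might look roughly like the sketch below. It assumes the connector states have already been fetched as tab-separated `name<TAB>state` lines and that `Succeeded` is the only healthy state. Both are assumptions, and the query that produces the lines is not shown.

```shell
#!/usr/bin/env bash
# Sketch: read "name<TAB>state" lines on stdin, warn on stderr for each
# connector that is not healthy, and print the number of unhealthy ones.
warn_unhealthy_connectors() {
  local unhealthy=0 name state
  while IFS=$'\t' read -r name state; do
    [ -n "$name" ] || continue
    if [ "$state" != "Succeeded" ]; then
      echo "WARNING: connector '$name' is in state '$state'" >&2
      unhealthy=$((unhealthy + 1))
    fi
  done
  echo "$unhealthy"
}

# Sample data standing in for real query output.
unhealthy_count=$(printf 'dynatrace\tFailed\nazmon\tSucceeded\n' | warn_unhealthy_connectors)
echo "unhealthy connectors: $unhealthy_count"
```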
… Backend flag

- Assemble-Agent.ps1: resolve python3 to real executable (Windows Store
  stub returns exit 9009). Verify 'Python 3' in --version output.
- Assemble-Agent.ps1: replace jq '// ""' with '// empty' to avoid
  PowerShell quote mangling with jq 1.8 on Windows.
- Apply-Extras.ps1: add URL normalization for repo URLs in both code
  paths (prepend https://github.com/ to short org/repo format).
- Clone-Agent.ps1: add -Backend parameter (bicep|terraform) to route
  to Deploy-Tf.ps1 when terraform is specified.

All recipes should default to Low access + Review mode as best practice.

PowerShell 7.3+ changed native command argument passing, breaking jq
filters with commas, //, semicolons, and --argjson values.

- Add Invoke-Jq.ps1: shared wrapper that writes filters to temp files
  and uses jq -f, bypassing all PS argument mangling
- Convert all PS scripts to use Invoke-Jq for complex filters
- Replace --argjson with --slurpfile pattern for JSON variables
- Replace --argjson booleans/numbers with --arg + jq-side conversion
- Fix Assemble-Agent.ps1 pipeline slurp bug (join inputs before piping)
- Add SubscriptionId alias to Apply-Extras.ps1 -Subscription parameter
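
The "--arg + jq-side conversion" item can be illustrated with standard jq: `--arg` always binds a string, so the filter converts it with `tonumber` or a string comparison. This shell sketch shows only the jq side. The variable names are made up, and in PowerShell the filter itself would go through Invoke-Jq rather than inline quotes.

```shell
#!/usr/bin/env bash
# Sketch: pass numbers and booleans as plain strings, convert inside jq.
if command -v jq >/dev/null 2>&1; then
  converted=$(jq -n -c \
    --arg count "3" \
    --arg enabled "true" \
    '{count: ($count | tonumber), enabled: ($enabled == "true")}')
  echo "$converted"
else
  echo "jq not found; skipping demo" >&2
fi
```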

Affected: Invoke-Jq.ps1 (new), New-Agent.ps1, Assemble-Agent.ps1,
Verify-Agent.ps1, Diff-Agent.ps1, Export-Agent.ps1, Deploy-Tf.ps1,
Apply-Extras.ps1

Tested: all 4 recipes (minimal, azmon, dynatrace, pagerduty) pass
New-Agent, Deploy DryRun, and Deploy WhatIf on PS 7.5.1/macOS.
Collaborator

@vyomnagrani vyomnagrani left a comment

LGTM

@vyomnagrani vyomnagrani merged commit 7959f5a into microsoft:main May 15, 2026
1 of 2 checks passed
2 participants