refactor: migrate project endpoint commands to new scaffold #8243
huimiu wants to merge 10 commits into
….ai.agents

Migrate the project endpoint set/unset/show logic introduced in #8162 (originally added to `azure.ai.agents` as `azd ai agent project ...`) into the new `azure.ai.projects` extension. Subcommands hang directly off the extension root (which is already `project`), so users get:

- `azd ai project set <endpoint>`: persist a default Foundry project endpoint
- `azd ai project unset`: clear the persisted endpoint
- `azd ai project show`: show the resolved endpoint and source

Key adjustments vs source PR:

- Module path `azure.ai.projects`
- Config namespace `extensions.ai-projects.context` (independent of the agents extension's store)
- Suggestion strings reference `azd ai project set`
- 5-level resolver split into its own `project_resolver.go`

The resolver still implements the spec'd cascade: flag → active azd env (`AZURE_AI_PROJECT_ENDPOINT`) → global config → host `FOUNDRY_PROJECT_ENDPOINT` → structured `missing_project_endpoint` error. Invalid values at any level are hard validation errors (no silent fallback).

A minimal `internal/exterrors` package is introduced with only the `Validation` / `Dependency` factories and codes required by the migrated commands.
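The cascade described above can be sketched roughly as follows. This is a minimal illustration, not the code from `project_resolver.go`: the `inputs` struct, the `resolve` and `validate` functions, the source labels, and the example URLs are all assumptions made for the sketch, and the validation is reduced to a placeholder https check.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// inputs carries the raw value found at each level (hypothetical struct).
type inputs struct {
	Flag         string // explicit flag value
	AzdEnv       string // AZURE_AI_PROJECT_ENDPOINT from the active azd env
	GlobalConfig string // value persisted in global config
	HostEnv      string // FOUNDRY_PROJECT_ENDPOINT from the host environment
}

// errMissingProjectEndpoint stands in for the structured
// missing_project_endpoint error (level 5 of the cascade).
var errMissingProjectEndpoint = errors.New("missing_project_endpoint")

// validate is a deliberately small placeholder: https scheme and a host.
func validate(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme != "https" || u.Hostname() == "" {
		return fmt.Errorf("%q is not an https URL", raw)
	}
	return nil
}

// resolve returns the first non-empty level. A non-empty but invalid value
// is a hard error; the cascade never falls through past it.
func resolve(in inputs) (endpoint, source string, err error) {
	levels := []struct{ name, value string }{
		{"flag", in.Flag},
		{"azd-env", in.AzdEnv},
		{"global-config", in.GlobalConfig},
		{"host-env", in.HostEnv},
	}
	for _, l := range levels {
		v := strings.TrimSpace(l.value)
		if v == "" {
			continue
		}
		if verr := validate(v); verr != nil {
			return "", l.name, fmt.Errorf("invalid endpoint from %s: %w", l.name, verr)
		}
		return v, l.name, nil
	}
	return "", "", errMissingProjectEndpoint
}

func main() {
	ep, src, err := resolve(inputs{
		GlobalConfig: "https://example.services.ai.azure.com/api/projects/demo",
		HostEnv:      "https://other.services.ai.azure.com/api/projects/demo",
	})
	fmt.Println(ep, src, err)
}
```

Walking an ordered slice keeps the precedence visible in one place, and returning on the first non-empty value (valid or not) is what gives the "no silent fallback" behavior.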
📋 Prioritization Note: Thanks for the contribution! The linked issue isn't in the current milestone yet.
Pull request overview
This PR migrates Foundry project endpoint persistence and resolution into the azure.ai.projects extension, adding top-level set, unset, and show commands plus validation, config storage, and tests.
Changes:
- Adds endpoint validation and 5-level resolution cascade.
- Adds global config persistence helpers and project endpoint commands.
- Adds command/unit tests and updates module dependencies.
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `internal/exterrors/errors.go` | Adds structured local error helpers. |
| `internal/exterrors/codes.go` | Defines endpoint-related error codes. |
| `internal/cmd/root.go` | Registers set, unset, and show commands. |
| `internal/cmd/project_set.go` | Implements endpoint persistence command. |
| `internal/cmd/project_unset.go` | Implements endpoint clearing command. |
| `internal/cmd/project_show.go` | Implements endpoint display command. |
| `internal/cmd/project_endpoint.go` | Adds endpoint validation and source types. |
| `internal/cmd/project_resolver.go` | Adds endpoint resolution cascade. |
| `internal/cmd/project_context_store.go` | Adds global config read/write/clear helpers. |
| `internal/cmd/extension_context.go` | Adds nil-safe extension context helper. |
| `internal/cmd/*_test.go` | Adds tests for commands, resolver, validation, and flag metadata. |
| `go.mod` | Updates direct/indirect dependencies. |
| `go.sum` | Removes unused module checksums. |
wbreza
left a comment
Thanks @huimiu! Nice clean structure on the cascade resolver and the test coverage on validation is strong. A few things worth resolving before this lands — most are architectural questions about how this fits alongside azure.ai.agents, plus a couple of correctness items.
High
1. Is this actually a migration?
The PR title says "migrate project endpoint commands from azure.ai.agents", but the old files in cli/azd/extensions/azure.ai.agents/internal/cmd/project_{set,show,unset,endpoint,context_store}.go are unchanged. After this lands, both extensions expose the same project set/show/unset commands.
Could you clarify the intent?
- If this is a true migration: consider removing the commands from `azure.ai.agents` (and the duplicated `internal/exterrors` package) in this PR, along with a CHANGELOG note pointing users to the new location.
- If `azure.ai.agents` is staying around for a while: could the PR title and description be updated to reflect "add project endpoint commands to `azure.ai.projects`", with a short note in both extensions' READMEs explaining which to prefer? Otherwise reviewers and users won't know which one is canonical.
2. Persisted config key change risks silent data loss
The old extension persists to extensions.ai-agents.project.context and the new one persists to extensions.ai-projects.context (note: the new path drops .project, so it's not just a namespace rename). Users with an existing persisted endpoint in the old key will see "no endpoint set" in the new extension with no signal that anything was lost.
Two options worth considering:
- One-time legacy read: when the new resolver hits level 3 (global config) and finds nothing, fall back to reading the old `extensions.ai-agents.project.context` key and, if present, surface a note like "found endpoint persisted by `azure.ai.agents`; run `azd ai project set <endpoint>` to migrate."
- Or explicitly call out the re-`set` requirement in the PR description / release notes so this is a known break rather than a silent one.
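To make the first option concrete, here is a minimal sketch of a legacy-read fallback. It assumes the global config can be treated as a flat `map[string]string`, which is a simplification for illustration; `readPersisted` and the example endpoint are hypothetical names and values, and only the two key strings come from the discussion above.

```go
package main

import "fmt"

const (
	newKey    = "extensions.ai-projects.context"
	legacyKey = "extensions.ai-agents.project.context"
)

// readPersisted looks up the new key first and only consults the legacy key
// when the new key is empty. fromLegacy lets the caller surface a note.
func readPersisted(cfg map[string]string) (endpoint string, fromLegacy bool) {
	if v := cfg[newKey]; v != "" {
		return v, false
	}
	if v := cfg[legacyKey]; v != "" {
		return v, true
	}
	return "", false
}

func main() {
	cfg := map[string]string{
		legacyKey: "https://example.services.ai.azure.com/api/projects/demo",
	}
	if ep, legacy := readPersisted(cfg); legacy {
		fmt.Printf("found endpoint %s persisted by azure.ai.agents; "+
			"run `azd ai project set <endpoint>` to migrate\n", ep)
	}
}
```

Returning the `fromLegacy` flag alongside the value keeps the read side-effect free, so the caller decides whether to print a note or do anything else with the signal.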
3. internal/exterrors is duplicated rather than shared
The new internal/exterrors is a thin subset of the identical package in azure.ai.agents/internal/exterrors. Both will need parallel maintenance for any future factory or code. If a shared location exists (or one can be added under cli/azd/extensions/shared/), importing from there would avoid the divergence. If sharing is intentionally out of scope for this PR, a short comment in the new package noting "subset of azure.ai.agents/internal/exterrors; consolidation tracked in #..." would help future maintainers.
4. Suggestion string is built by concatenation in project_show.go
```go
return exterrors.Dependency(
	exterrors.CodeMissingProjectEndpoint,
	localErr.Message,
	"run `azd ai project set <endpoint>` to persist a default, or " + localErr.Suggestion,
)
```

If `localErr.Suggestion` is empty the user sees a trailing "…persist a default, or ". If non-empty, two suggestion strings get glued together with potentially mismatched grammar. Prefer a single canonical suggestion authored at this call site, and either ignore `localErr.Suggestion` or branch on it explicitly.
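One way the "branch on it explicitly" option might look, as a sketch only: `buildSuggestion` and `canonicalSuggestion` are hypothetical names, and the "Alternatively:" joiner is an illustrative wording choice rather than anything from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalSuggestion is authored once, at the call site that owns it.
const canonicalSuggestion = "run `azd ai project set <endpoint>` to persist a default"

// buildSuggestion branches explicitly on the upstream suggestion so an empty
// value never leaves a dangling ", or " at the end of the message.
func buildSuggestion(upstream string) string {
	upstream = strings.TrimSpace(upstream)
	if upstream == "" {
		return canonicalSuggestion + "."
	}
	return canonicalSuggestion + ". Alternatively: " + upstream
}

func main() {
	fmt.Println(buildSuggestion(""))
	fmt.Println(buildSuggestion("set FOUNDRY_PROJECT_ENDPOINT"))
}
```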
Medium
5. fmt.Errorf("...: %w", err) wraps NewAzdClient() errors
In project_set.go and project_unset.go:
```go
azdClient, err := azdext.NewAzdClient()
if err != nil {
	return fmt.Errorf("failed to create azd client: %w", err)
}
```

Per the extensions error-handling contract (see `azure.ai.agents/AGENTS.md`), wrapping with `fmt.Errorf` defeats classification by host middleware (`status.FromError`, `errors.AsType[*LocalError]`): the wrapper string gets surfaced instead of the structured error's category/code. Returning the error unchanged, or classifying it to a structured `exterrors.*` here, is the documented pattern. The same applies in any sibling that wraps gRPC/structured errors.
6. Test seam mutates a package-level function pointer
```go
// project_resolver.go
var readAzdHostedSourcesFunc = readAzdHostedSources
// project_resolver_test.go: stubAzdHostedSources mutates the global
```

No race today because the resolver tests don't use `t.Parallel()`, but other tests in the same package do, and given the table-driven structure here it's very tempting to add. The moment a parallel test calls `stubAzdHostedSources` you'll get a data race under `-race`. Safer: inject `readAzdHostedSources` via the existing `resolveProjectEndpointOpts` struct so each test owns its own seam.
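A minimal sketch of the injected-seam idea, with invented stand-ins: `resolveOpts`, `hostedSources`, `resolveEndpoint`, and `defaultReadHostedSources` are hypothetical simplifications, not the actual types in the PR. The point is only that the reader function travels with each call instead of living in a package-level variable.

```go
package main

import (
	"errors"
	"fmt"
)

// hostedSources holds values read from the azd host (illustrative fields).
type hostedSources struct {
	EnvEndpoint string
	CfgEndpoint string
}

// resolveOpts carries the seam per call; nil means "use the real reader".
type resolveOpts struct {
	Flag              string
	ReadHostedSources func() (hostedSources, error)
}

// defaultReadHostedSources stands in for the production reader.
func defaultReadHostedSources() (hostedSources, error) {
	return hostedSources{}, nil
}

func resolveEndpoint(opts resolveOpts) (string, error) {
	if opts.Flag != "" {
		return opts.Flag, nil
	}
	read := opts.ReadHostedSources
	if read == nil {
		read = defaultReadHostedSources
	}
	src, err := read()
	if err != nil {
		return "", err
	}
	if src.EnvEndpoint != "" {
		return src.EnvEndpoint, nil
	}
	if src.CfgEndpoint != "" {
		return src.CfgEndpoint, nil
	}
	return "", errors.New("no project endpoint configured")
}

func main() {
	// Each caller (or test) owns its own stub; nothing global is mutated.
	stub := func() (hostedSources, error) {
		return hostedSources{CfgEndpoint: "https://stub.example"}, nil
	}
	ep, err := resolveEndpoint(resolveOpts{ReadHostedSources: stub})
	fmt.Println(ep, err)
}
```

Because the stub is a value inside the options struct, two tests running with `t.Parallel()` would each pass their own function and share no mutable state.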
7. unset idempotent path reads as a failure
When no endpoint is set, unset prints "No active project endpoint to clear." and returns "cleared": false in JSON. The command succeeded (it was idempotent), but both the message and the boolean read like rejections. Renaming the field (hadPreviousEndpoint or wasSet) and/or rephrasing the message to confirm the no-op state would make automation and users happier.
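As a rough illustration of the suggested shape, the sketch below models the idempotent path with one of the proposed field names (`wasSet`). The `unsetResult` struct, the `unset` helper, the flat map used as a config store, and both messages are assumptions for the example, not the command's real output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unsetResult reports what the command found, not whether it "worked":
// the command succeeds either way.
type unsetResult struct {
	WasSet  bool   `json:"wasSet"`
	Message string `json:"message"`
}

// unset removes key from cfg and describes the prior state.
func unset(cfg map[string]string, key string) unsetResult {
	_, had := cfg[key]
	delete(cfg, key)
	if !had {
		return unsetResult{WasSet: false, Message: "No project endpoint was set; nothing to clear."}
	}
	return unsetResult{WasSet: true, Message: "Cleared the persisted project endpoint."}
}

func main() {
	cfg := map[string]string{}
	out, err := json.Marshal(unset(cfg, "extensions.ai-projects.context"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```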
Notes (non-blocking)
- The 5-level cascade is well-factored and the design spec correctly counts the hard-error as level 5 — no change needed there.
- Security pass came back clean: HTTPS-only enforced, host allowlist (`.services.ai.azure.com`), explicit ports rejected, and crucially all four input sources are re-validated before use, so an invalid intermediate level errors hard instead of silently falling through. Nice.
- Related to #2 above and the existing Copilot inline on `project_context_store.go`: while touching `clear`/`getProjectContext`, consider making the `unset` path best-effort when reading the previous value, so a malformed persisted blob can always be cleared (the bot's suggestion).
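The three endpoint checks named in the security note (HTTPS only, host suffix allowlist, no explicit port) could be sketched like this. `validateEndpoint` is a hypothetical function written for illustration using the standard `net/url` package; the real validation in `project_endpoint.go` may differ in details and error wording.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// allowedHostSuffix is the allowlisted suffix mentioned in the review.
const allowedHostSuffix = ".services.ai.azure.com"

// validateEndpoint applies the three checks in order and returns the first
// failure. It does not contact the network.
func validateEndpoint(raw string) error {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return fmt.Errorf("parse endpoint: %w", err)
	}
	if u.Scheme != "https" {
		return errors.New("endpoint must use https")
	}
	if u.Port() != "" {
		return errors.New("endpoint must not specify an explicit port")
	}
	host := strings.ToLower(u.Hostname())
	// Require at least one label in front of the allowlisted suffix.
	if !strings.HasSuffix(host, allowedHostSuffix) || len(host) == len(allowedHostSuffix) {
		return fmt.Errorf("host %q is not under %s", host, allowedHostSuffix)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://demo.services.ai.azure.com/api/projects/p",
		"http://demo.services.ai.azure.com",
		"https://demo.services.ai.azure.com:443",
		"https://demo.services.ai.azure.com.evil.example",
	} {
		fmt.Printf("%s -> %v\n", raw, validateEndpoint(raw))
	}
}
```

Checking `u.Hostname()` rather than the raw string matters: a suffix match on the whole URL would accept a path or userinfo segment that merely contains the allowlisted text.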
Code-correctness fixes (azure.ai.projects):

- clearProjectContext now reads the previous endpoint best-effort so a malformed/older persisted blob cannot block 'unset' from clearing it.
- project_show no longer concatenates the cascade-error suggestion onto its own; the structured noProjectEndpointError already includes the 'azd ai project set' guidance, so the resolver error is returned unchanged.
- NewAzdClient() failures are now classified with exterrors.Dependency(CodeAzdClientFailed, ...) instead of fmt.Errorf, so azd host classification (status.FromError / errors.AsType[*LocalError]) is preserved.
- resolveProjectEndpointOpts now carries an injectable ReadAzdHostedSources seam; the package-level readAzdHostedSourcesFunc global is gone, so resolver tests no longer mutate shared state.
- The 'unset' JSON shape renames 'cleared' to 'previouslySet' and the no-op message now reads as confirmation, not failure.
- exterrors package gets a doc note explaining the duplication with azure.ai.agents/internal/exterrors and pointing at future consolidation.

Migration (azure.ai.agents):

- Removed project, project_set, project_unset, project_show and their tests; removed the rootCmd.AddCommand(newProjectCommand(extCtx)) registration; pruned now-dead setProjectContext/clearProjectContext helpers.
- CHANGELOG entry added directing users to 'azd ai project ...'.

Backward-compat (azure.ai.projects):

- The resolver now reads the legacy 'extensions.ai-agents.project.context' key as a one-time fallback when the new 'extensions.ai-projects.context' key has no value. Legacy values surface as SourceGlobalConfig with a FromLegacyAgentsConfig flag, and project_show prints a migration notice on stderr (plus emits the flag in JSON output).
…ontext key

The agents-internal project endpoint resolver previously read only extensions.ai-agents.project.context (the legacy key written by the now-removed 'azd ai agent project set' command). The command has since migrated to 'azd ai project set' in azure.ai.projects, which writes to extensions.ai-projects.context, so agent commands (run, invoke, etc.) could not resolve an endpoint persisted by the new command.

getProjectContext now prefers the new key and falls back to the legacy key, with best-effort error handling on the legacy read so a malformed legacy blob cannot block resolution from the new key, explicit flags, or FOUNDRY_PROJECT_ENDPOINT. CHANGELOG updated to call this out.
- Revert all CHANGELOG.md edits made in this PR per maintainer request.
- Audit and trim comments added by this PR:
- Drop the misleading 'parallel tests do not race on a shared global'
framing on the test-only stub helper; the resolver tests use
t.Setenv and don't t.Parallel, so the claim was forward-looking
only. Rewritten to describe what the helper actually does.
- Tighten the doc comments on:
* resolveProjectEndpointOpts.ReadAzdHostedSources
* resolvedEndpoint.FromLegacyAgentsConfig
* azdHostedSources.CfgFromLegacyAgents
* legacyAgentsContextPath
* getLegacyAgentsProjectContext
* clearProjectContext
* project_show JSON struct and Run error path
* exterrors package doc
* agents-side getProjectContext + projectContextState +
projectsExtensionContextPath + projectContextConfigPath
- All cuts remove historical narrative or restated obvious behavior;
accuracy-critical invariants (best-effort legacy reads, idempotent
unset, where each path is written/read) are retained.
wbreza
left a comment
Re-reviewed the 4 follow-up commits — all of the prior High and Medium items are cleanly resolved, no new issues, no regressions. Nice work @huimiu.
Prior findings — status
- H1 (migration not copy) ✅: old `azure.ai.agents/internal/cmd/project_*.go` + tests deleted, `root.go` no longer registers `newProjectCommand`.
- H2 (config key data loss) ✅: bi-directional bridge in place. `azure.ai.projects` reads the legacy `extensions.ai-agents.project.context` as fallback, `azure.ai.agents` prefers the new key and falls back to legacy. Validation runs uniformly against either source (so a malformed/invalid legacy value still hard-errors, no surprise fallthrough). The follow-up commit "fix(ai.projects): clear legacy endpoint" extending `clearProjectContext` to unset both keys is exactly the right catch; without it, `set → unset → show` would resurrect the orphan.
- H3 (exterrors duplication) ✅: acknowledged via a package doc comment pointing at future consolidation. Reasonable trade-off for the scope of this PR.
- H4 (suggestion concat in `project_show`) ✅: resolver error is returned untouched; the `noProjectEndpointError` already carries the actionable suggestion.
- M1 (`NewAzdClient` wrap) ✅: new `CodeAzdClientFailed` + `exterrors.Dependency(...)` in both `set` and `unset`. Classification middleware will now see the structured error.
- M2 (package-global test seam) ✅: `ReadAzdHostedSources` is now a per-call field on `resolveProjectEndpointOpts`; tests use `t.Setenv`. Comment honestly describes what the helper does rather than overclaiming race safety.
- M3 (`unset` idempotent UX) ✅: JSON field renamed `previouslySet`, no-op message reads as positive confirmation.
Verified non-regressions
- Cascade hard-fail behavior preserved across both keys (covered by `TestResolveProjectEndpoint_AzdEnvInvalidRejected`, `TestResolveProjectEndpoint_GlobalConfigInvalidRejected`).
- `FromLegacyAgentsConfig` flag is wired consistently through both the JSON output and the stderr migration notice in `project_show`.
- New `project_context_store_test.go` covers the legacy-only and both-key clear paths.
- Security pass is still clean: HTTPS-only, host suffix allowlist, port rejection, validate-at-every-level.
One small bit of polish I really appreciated: commit 3's audit that trimmed a "no race" framing on the test helper because the tests actually use t.Setenv and don't t.Parallel — better to describe what code does than what it might do later.
LGTM. 🚀
trangevi
left a comment
Can you please confirm any necessary changes to these files for the extension as well:
I haven't gotten to go through and validate all of those for each of the extensions, and I believe they likely should be updated for consistency. This PR has a sample that I took that image from: #8264
```go
"`extensions.ai-agents.project.context` key written by the "+
	"removed `azd ai agent project set` command. Run "+
	"`azd ai project set <endpoint>` to migrate it to the "+
	"new `extensions.ai-projects.context` key.")
```
We should just do this step for the user, remove the old key and replace it with the new one
Updated. The resolver now auto-migrates the legacy key to extensions.ai-projects.context and removes extensions.ai-agents.project.context on the first run that observes it, no manual azd ai project set needed. The migration is best-effort so a transient write failure never breaks the command in flight.
The resolver previously detected an endpoint persisted under the legacy `extensions.ai-agents.project.context` key and asked the user to re-run `azd ai project set` to migrate it. Address review feedback by performing the migration automatically the first time the legacy key is observed: copy the value to `extensions.ai-projects.context` and clear the legacy key. The write is best-effort so a transient config failure never breaks the command the user actually invoked. Update the `azd ai project show` notice to confirm the migration instead of prompting the user to run another command, and add unit tests for the new migrate helper covering happy path plus set/unset failure modes.
wbreza
left a comment
Re-reviewed the new commit (cc74d1979e — feat(ai.projects): auto-migrate legacy ai-agents endpoint) since my prior approval. Just one carryover item from another reviewer to confirm before this lands.
Auto-migrate commit — looks great ✅
The implementation cleanly addresses @trangevi's "do it for the user" feedback on project_show.go:
- Resolver path is right: `readAzdHostedSources` reads the new key first and only falls back to the legacy key when the new key is empty. When the legacy hits, the resolver kicks off `migrateLegacyAgentsProjectContext` and continues with the in-memory value, so the user-invoked command never blocks on a config write.
- `writeMigratedProjectContext` ordering is correct: write new key first, then unset legacy. If the unset fails, both keys hold the same value, the next run re-migrates (overwrite + retry unset), idempotent. `TestWriteMigratedProjectContext_UnsetFailureBubblesUp` captures exactly this state.
- Set-failure path preserves legacy: `TestWriteMigratedProjectContext_SetFailureLeavesLegacyKey` confirms the legacy key isn't cleared when the new write fails, so no data loss is possible from a partial migration.
- `FromLegacyAgentsConfig` semantics are honest: doc comment says "only true on the first run that observes the legacy key" and the implementation matches: run 1 reads from legacy → flag true → migration runs; run 2 reads from new key → flag false. JSON consumers and the stderr notice both get a deterministic one-shot signal.
- `project_show` notice rewritten in past tense ("migrated this endpoint…"), which matches the new auto-behavior and no longer asks the user to do anything.
- `clearProjectContextFromConfig` still clears both keys, so `unset` after the migration window won't leave orphans. Covered by `TestClearProjectContextFromConfig_ClearsCanonicalAndLegacy`.
The _ = migrateLegacyAgentsProjectContext(...) best-effort swallow is intentional and the right call given the resolver's contract (never break the command in flight). A log.Debug on the dropped error would help future diagnosis but is not blocking — the legacy fallback guarantees the next run retries.
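The write-then-unset ordering and the best-effort call can be illustrated with a small in-memory model. Everything here is a stand-in: `memStore`, its failure switches, and `migrate` are hypothetical and only mimic the behavior described above; they are not the PR's `writeMigratedProjectContext` or its config client.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	newKey    = "extensions.ai-projects.context"
	legacyKey = "extensions.ai-agents.project.context"
)

// memStore is an in-memory config store with switches to simulate failures.
type memStore struct {
	data      map[string]string
	failSet   bool
	failUnset bool
}

func (m *memStore) Set(key, value string) error {
	if m.failSet {
		return errors.New("set failed")
	}
	m.data[key] = value
	return nil
}

func (m *memStore) Unset(key string) error {
	if m.failUnset {
		return errors.New("unset failed")
	}
	delete(m.data, key)
	return nil
}

// migrate writes the new key first and only then removes the legacy key.
// A failed Set leaves the legacy value untouched; a failed Unset leaves both
// keys holding the same value, so a later run can simply repeat the steps.
func migrate(s *memStore, value string) error {
	if err := s.Set(newKey, value); err != nil {
		return fmt.Errorf("write new key: %w", err)
	}
	if err := s.Unset(legacyKey); err != nil {
		return fmt.Errorf("unset legacy key: %w", err)
	}
	return nil
}

func main() {
	s := &memStore{data: map[string]string{
		legacyKey: "https://example.services.ai.azure.com/api/projects/demo",
	}}
	value := s.data[legacyKey]
	// Best-effort: the caller keeps using the in-memory value either way.
	_ = migrate(s, value)
	fmt.Println(value, s.data)
}
```

Reversing the two steps would open a window where neither key holds the value, which is the data-loss case the ordering is meant to rule out.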
Nothing else to flag on this commit.
Carryover — @trangevi CHANGES_REQUESTED still open ⚠️
The "validate consistency with the files in this image" ask from #8243 (review) doesn't appear to have been addressed yet. Comparing cli/azd/extensions/azure.ai.projects/ to the sibling extensions (azure.ai.agents is the source of this migration; azure.ai.inspector from PR #8264 is the reference trangevi pointed at), the projects extension is missing several scaffold files that both siblings have:
| File | azure.ai.agents | azure.ai.inspector (PR #8264) | azure.ai.projects (this PR) |
|---|---|---|---|
| `.golangci.yaml` | ✅ | ✅ | ❌ |
| `AGENTS.md` | ✅ | ✅ | ❌ |
| `CHANGELOG.md` | 22 KB, real entries | 69 B | 53 B, "Initial Version" only |
| `README.md` | 1.3 KB | 3.1 KB, usage examples | 93 B, one-liner |
Specifically:
- `.golangci.yaml`: Both sibling extensions ship the same 198-byte `.golangci.yaml`. Without it, this module won't pick up the same lint rules as the rest of the extension tree.
- `AGENTS.md`: `azure.ai.agents` has one (6.7 KB), `azure.ai.inspector` has one (2.9 KB). Both describe build/test commands, output conventions, and the two-PR release flow. The projects extension should ship an equivalent so future agent/contributor work on this extension follows the same conventions, and so the documented `azd ai project set/show/unset` UX, the global-config key, and the legacy-key bridge contract aren't only described in the PR body.
- `CHANGELOG.md`: This PR adds ~1,300 lines of net-new functionality (commands, resolver cascade, validation, persistence, legacy migration) but the changelog still reads "0.0.1-preview - Initial Version". At minimum, the 0.0.1-preview section should describe what shipped (commands list, persistence behavior, legacy migration). Aligns with the release prep flow documented in `azure.ai.inspector/AGENTS.md`.
- `README.md`: Currently one sentence. The inspector README documents commands, flags, and examples; agents README documents capabilities. Worth at least listing `set / show / unset`, the persisted config key, the FOUNDRY_PROJECT_ENDPOINT env var, and the auto-migration behavior so users have a single discoverable surface.
These are scaffold/documentation gaps rather than functional issues, so they don't undo my prior LGTM on the code itself. But trangevi's review is the active blocker on the PR state, and the asks line up cleanly with the missing files above — would recommend either filling them in here or getting explicit sign-off from trangevi that they're acceptable to follow up separately before merging.
Code re-review status: clean on cc74d1979e. PR-level approval still gated on resolving the scaffold-consistency feedback.
Align the new extension with sibling AI extensions (`azure.ai.agents`, `azure.ai.training`, etc.): - `.gitattributes` matches `microsoft.azd.extensions` so any future `*.go.tmpl` files keep LF line endings on Windows checkouts. - `.golangci.yaml` mirrors the shared lint config used across the other extensions (gosec/lll/unused/errorlint, lll line-length 220, gofmt formatter). - `AGENTS.md` adapts the `azure.ai.agents` guide to this extension and documents the project endpoint cascade, the persisted-context store invariants (best-effort legacy reads, idempotent unset, best-effort auto-migration), and the release-prep two-PR flow.
wbreza
left a comment
Re-reviewed 2a23e08b (chore: add gitattributes, golangci, AGENTS docs) — scaffold-consistency feedback is cleanly resolved.
Verified
- `.golangci.yaml`: byte-identical to `azure.ai.agents/.golangci.yaml` (same `gosec`/`lll`/`unused`/`errorlint` enable set, lll line-length 220, gofmt formatter). Lint parity with sibling extensions ✅
- `.gitattributes`: `*.go.tmpl text eol=lf`, matches the convention from `microsoft.azd.extensions`. Future template files won't get CRLF'd on Windows checkouts ✅
- `AGENTS.md`: adapted from the agents guide and genuinely tailored to this extension, not just copy-pasted. The bits that stood out:
  - Cascade is documented in the same order as the resolver code (flag → azd env → global config → FOUNDRY_PROJECT_ENDPOINT → structured error), with the auto-migration called out explicitly.
  - "Persisted project context" section codifies the three load-bearing invariants that I'd otherwise have to re-derive from the code on every revisit: best-effort legacy reads, idempotent `clearProjectContext` that clears both keys even on read failure, and best-effort migration in `readAzdHostedSources`. Future maintainers will appreciate this.
  - Error-handling rule mirrors `azure.ai.agents/AGENTS.md` (the "create structured errors once, close to the classification site; don't wrap with `fmt.Errorf` after that point" rule). Keeps the gRPC serialization contract intact across the AI extensions.
  - Release prep documents the same two-PR flow (version bump → registry update) used by the inspector/training extensions.
Carryover items (intentionally non-blocking)
CHANGELOG.md is still the bare "0.0.1-preview - Initial Version" stub and README.md is still the one-liner. Looking at the sibling extensions, this matches the actual convention — azure.ai.inspector's changelog is also 69 bytes pre-release, and the AGENTS.md you just added documents that CHANGELOG entries are added at the release PR, not the feature PR. The README staying minimal is also fine for a freshly-scaffolded extension that hasn't shipped yet. So I'm withdrawing those as concerns.
Approval
All prior High/Medium findings from the original review are resolved, the auto-migrate commit (cc74d1979e) was clean on its own, and the scaffold gap raised by @trangevi is closed. No new issues, no regressions.
LGTM, re-approving.
@trangevi I have added the missing files. The extension now contains:
wbreza
left a comment
Re-reviewed the two new commits — both are docs/scaffold, no code changes.
1af5f1b8 — CODEOWNERS entry ✅
Adds /cli/azd/extensions/azure.ai.projects/ with owners aligned to azure.ai.agents (@jongio @wbreza @vhvb1989 @hemarina @weikanglim @JeffreyCA @tg-msft @rajeshkamal5050 @trangevi @trrwilson @therealjohn) plus @huimiu. Same set as the sibling AI extensions — reasonable.
ac18d77c — AGENTS.md alignment ✅
Two changes:
- Added a "Recommended pattern" code block under Error handling showing the layered convention (lower helpers wrap with `fmt.Errorf("context: %w", err)`, the top-level command site classifies once into `exterrors.Validation`). Concrete and matches the rule stated just above it.
- Moved "Release preparation" to sit before "Output: `log` vs `fmt`", matching the section order in `azure.ai.agents/AGENTS.md`. Pure reordering; no content lost.
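For readers who have not seen the AGENTS.md block, the layered convention might look something like the sketch below. It is not the code block from AGENTS.md: `validationError` is a hypothetical stand-in for whatever `exterrors.Validation` produces, and `parseEndpoint`, `runSet`, the `invalid_project_endpoint` code, and the suggestion text are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validationError is a hypothetical stand-in for a structured error that
// carries a machine-readable code plus user-facing text.
type validationError struct {
	Code       string
	Message    string
	Suggestion string
}

func (e *validationError) Error() string { return e.Message }

var errNotHTTPS = errors.New("scheme must be https")

// parseEndpoint is a lower-level helper: it only adds context with %w and
// leaves classification to its caller.
func parseEndpoint(raw string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parse endpoint: %w", err)
	}
	if u.Scheme != "https" {
		return nil, fmt.Errorf("check endpoint %q: %w", raw, errNotHTTPS)
	}
	return u, nil
}

// runSet is the top-level command site: it classifies exactly once and
// returns the structured error without wrapping it again.
func runSet(raw string) error {
	if _, err := parseEndpoint(raw); err != nil {
		return &validationError{
			Code:       "invalid_project_endpoint", // illustrative code
			Message:    err.Error(),
			Suggestion: "pass an https Foundry project endpoint",
		}
	}
	return nil
}

func main() {
	fmt.Println(runSet("http://demo.example"))
	fmt.Println(runSet("https://demo.example"))
}
```

Keeping classification at a single site means the structured value is the outermost error the host sees, which is the property the rule above is protecting.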
No new issues, no regressions. Prior approval stands. 🚀


Summary
Source PR: #8162
Migrates the Foundry project endpoint commands from `azure.ai.agents` into the new `azure.ai.projects` extension. The command surface is now:

- `azd ai project set <endpoint>`
- `azd ai project unset`
- `azd ai project show`

Impact

- `extensions.ai-projects.context.endpoint`.
- `azd ai agent project ...` command path while keeping existing agent context readable through the bridge to the new config key.