[ACR][Azure CLI] Improve error/timeout reporting for az acr repository list / catalog enumeration when registry contains extremely high-tag repository (silent partial results) #875

@wayden88

What is the problem you're trying to solve
Description (Customer scenario + problem statement)
A customer uses Azure Container Registry (Premium) as a centralized registry and depends on stable, complete repository enumeration to drive automation (cleanup pipelines). The customer reported that az acr repository list returned inconsistent, truncated repository lists (sometimes a small subset, sometimes more), causing cleanup pipelines to skip repositories. The command appeared to succeed, and the default output contained no actionable error message.

Describe the solution you'd like

Customer expectation: az acr repository list (https://learn.microsoft.com/en-us/cli/azure/acr/repository?view=azure-cli-latest#az-acr-repository-list) should either:

  1. reliably return the full repository catalog, or
  2. fail loudly with clear error + reason + actionable hint, rather than returning partial results that look valid.

Customer request: “It would have been faster/easier to catch this issue if the az acr commands provided an error message or timeout with proper logs by default.”

A) Make “partial enumeration” impossible to miss
For az acr repository list (and related enumeration paths), if enumeration does not complete successfully:

  • Return non-zero exit code
  • Emit a clear error indicating incomplete enumeration
  • Include an actionable message suggesting common causes (e.g., throttling/timeouts/large metadata repo) and mitigation steps (e.g., rerun with --debug, reduce tag cardinality in extremely large repos).
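
To make the requested behavior concrete, here is a minimal Python sketch of "fail loudly instead of returning partial results". It is not the Azure CLI implementation. `fetch_page`, `IncompleteEnumerationError`, `list_repositories`, and `main` are hypothetical names introduced only for illustration, and the page fetcher is assumed to return one page of names plus a "more pages follow" flag.

```python
import sys
from typing import Callable, List, Optional, Tuple

# Hypothetical page fetcher: receives the last repository name seen so far
# (None for the first page) and returns (names_on_page, has_more).
# It is assumed to raise on timeout or throttling.
FetchPage = Callable[[Optional[str]], Tuple[List[str], bool]]


class IncompleteEnumerationError(RuntimeError):
    """Raised when the catalog walk stops before the final page."""


def list_repositories(fetch_page: FetchPage) -> List[str]:
    """Return the full catalog, or raise; never return a partial list."""
    repos: List[str] = []
    last: Optional[str] = None
    while True:
        try:
            page, has_more = fetch_page(last)
        except Exception as exc:
            raise IncompleteEnumerationError(
                f"repository enumeration incomplete after {len(repos)} "
                f"repositories (last seen: {last!r}): {exc}. "
                "Possible causes: throttling, timeout, or a repository with "
                "very large metadata. Rerun with --debug for details."
            ) from exc
        repos.extend(page)
        if not has_more:
            return repos
        if not page:
            # More pages were promised but nothing came back; stop rather
            # than loop forever or pretend the listing finished.
            raise IncompleteEnumerationError(
                f"empty page received after {len(repos)} repositories "
                f"(last seen: {last!r}) although more pages were expected."
            )
        last = page[-1]


def main(fetch_page: FetchPage) -> int:
    """Print the catalog and return a process exit code."""
    try:
        repos = list_repositories(fetch_page)
    except IncompleteEnumerationError as exc:
        print(f"ERROR: {exc}", file=sys.stderr)
        return 1  # non-zero exit code so pipelines stop
    for name in repos:
        print(name)
    return 0
```

The point of the design is that nothing is written to stdout until the whole walk has succeeded, so a cleanup pipeline never consumes a list that only looks complete.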

B) Add explicit timeout / retry visibility

  • Provide defaults that surface when the command is retrying or timing out (even without --debug)
  • Optionally add a CLI flag like --diagnostic or --verbose-errors to include:
      • last successful page marker / continuation token
      • time spent on last page
      • whether response was truncated due to client timeout or backend throttling
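
A rough sketch of the diagnostic record such a flag could surface, written in Python purely for illustration. `PageDiagnostics`, `ThrottledError`, `enumerate_with_diagnostics`, and the `fetch_page` callable are all hypothetical; in particular, `ThrottledError` only stands in for whatever signal the backend uses for throttling.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling response from the registry."""


@dataclass
class PageDiagnostics:
    pages_completed: int          # pages fetched successfully before the stop
    last_token: Optional[str]     # continuation token of the page that failed
    last_page_seconds: float      # time spent on the page that failed
    cause: str                    # "client timeout" or "backend throttling"


# Hypothetical fetcher: takes a continuation token (None for the first page)
# and returns (names_on_page, next_token); next_token is None on the last page.
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def enumerate_with_diagnostics(
    fetch_page: FetchPage,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[str], Optional[PageDiagnostics]]:
    """Walk all pages; return (names, None) on success or (partial, diag)."""
    repos: List[str] = []
    token: Optional[str] = None
    pages = 0
    while True:
        started = clock()
        try:
            page, next_token = fetch_page(token)
        except (TimeoutError, ThrottledError) as exc:
            cause = (
                "client timeout"
                if isinstance(exc, TimeoutError)
                else "backend throttling"
            )
            return repos, PageDiagnostics(pages, token, clock() - started, cause)
        pages += 1
        repos.extend(page)
        if next_token is None:
            return repos, None
        token = next_token
```

A caller would treat any non-`None` diagnostics value as a failed run and print its fields, which covers the three items listed above.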

C) Add a “progress / checkpoint” hint for large registries
During listing, optionally show:

  • page count processed
  • last repository name processed (checkpoint)

This would allow users to quickly identify “it always stops before repo X,” which is exactly how the customer ultimately found the bloated repo.
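
As an illustration of the checkpoint idea, the Python sketch below writes one progress line per page to a separate stream (stderr by default) so the repository names on stdout stay machine-readable. `with_checkpoints` and the log line format are hypothetical, not an existing CLI feature.

```python
import sys
from typing import Iterable, Iterator, List, TextIO


def with_checkpoints(
    pages: Iterable[List[str]],
    log: TextIO = sys.stderr,
) -> Iterator[str]:
    """Yield repository names page by page, logging a checkpoint per page."""
    for page_number, page in enumerate(pages, start=1):
        yield from page
        if page:
            # The last name on the page is the checkpoint a user can compare
            # across runs to see where enumeration keeps stopping.
            log.write(
                f"page {page_number}: last repository processed = {page[-1]}\n"
            )
```

If every run ends with the same final checkpoint line, the repository that sorts right after that name is the first place to look.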

Additional context
What we observed

Initial symptom: az acr repository list returns a subset of repos; output differs run-to-run, causing automation gaps.
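
One way to confirm this symptom from captured output is to diff several runs. The Python sketch below assumes each run's output has already been parsed into a list of repository names; `unstable_repositories` is a hypothetical helper, not part of the CLI.

```python
from typing import Sequence, Set


def unstable_repositories(runs: Sequence[Sequence[str]]) -> Set[str]:
    """Return names that appear in at least one run but not in every run."""
    if not runs:
        return set()
    name_sets = [set(run) for run in runs]
    seen_anywhere = set().union(*name_sets)
    seen_everywhere = set.intersection(*name_sets)
    return seen_anywhere - seen_everywhere
```

A non-empty result means at least one run was truncated, even though each run on its own looked successful.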

Troubleshooting performed: We validated behavior using:

  1. az acr repository list (json/table/tsv)
  2. repository/tag checks
  3. data-plane pagination attempts
  4. review of client-side paging logic and token usage
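
For reference, the data-plane paging checked in step 3 can be sketched as below. This assumes the registry follows the Docker Registry HTTP API V2 catalog convention (`/v2/_catalog?n=<page size>`, with the next page advertised in a `Link` header carrying `rel="next"`). The `get` callable is hypothetical: it takes a path and returns the parsed JSON body plus a plain dict of response headers, so authentication and HTTP details are left out.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Matches: </v2/_catalog?last=foo&n=100>; rel="next"
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="?next"?')

# Hypothetical transport: path -> (parsed JSON body, response headers).
Get = Callable[[str], Tuple[Dict[str, object], Dict[str, str]]]


def next_path(link_header: Optional[str]) -> Optional[str]:
    """Extract the next-page path from a Link header, or None if absent."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def walk_catalog(get: Get, page_size: int = 100) -> List[str]:
    """Follow Link headers from the first catalog page until none is given."""
    path: Optional[str] = f"/v2/_catalog?n={page_size}"
    repos: List[str] = []
    while path is not None:
        body, headers = get(path)
        repos.extend(body.get("repositories") or [])
        path = next_path(headers.get("Link"))
    return repos
```

Note that a walk like this ends normally whenever a response carries no next link, so on its own it cannot tell a genuine last page from a truncated one. That is the gap this request asks the CLI to close.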

Root cause discovered by customer: One repository was overly bloated with >500k tags. Catalog/list operations would “silently fail” (partial results, no clear error surfaced) before reaching/processing that repo. Customer chose to prune/reduce tags in that repo.

Resolution: The customer will prune tags in that repository to bring it down to the required level, after which list operations are expected to stabilize. In summary, a single repository containing >500k tags caused enumeration operations (CLI + paged catalog calls) to fail silently, returning incomplete repository lists without a clear failure/timeout message.

Labels: feature-request
