Add ECS migration reruns and improve ACL failure reporting #25

Merged
drernie merged 13 commits into main from 061-acl-errors
Apr 11, 2026

Member

drernie commented Apr 10, 2026

Summary

  • make ECS recovery practical by adding a quiltx ecs run-migration command that can re-run the registry migration task for a stack
  • improve quiltx stack acl failure reporting so apply steps and GraphQL errors are visible immediately, with bucket, policy, role, SSO, and delete failures reported with context instead of disappearing into a generic failure
  • document the Lake Formation rollout failure, root cause, remediation, and rerun procedure so the ACL migration can be completed safely

Breaking change

  • interactive ECS shell access now lives under quiltx ecs shell; bare quiltx ecs shows subcommand help
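
A minimal sketch of the command layout this breaking change implies, assuming an argparse-style parser. This is not quiltx's actual implementation: `build_parser` and `dispatch` are hypothetical names, and only the `ecs`, `shell`, and `run-migration` command names come from this PR.

```python
import argparse


def build_parser():
    """Build a hypothetical parser with an `ecs` group and two subcommands."""
    parser = argparse.ArgumentParser(prog="quiltx")
    commands = parser.add_subparsers(dest="command")

    ecs = commands.add_parser("ecs", help="ECS helpers")
    ecs_commands = ecs.add_subparsers(dest="ecs_command")
    ecs_commands.add_parser("shell", help="open an interactive ECS shell")
    ecs_commands.add_parser("run-migration", help="re-run the registry migration task")
    return parser, ecs


def dispatch(argv):
    """Return the selected ecs subcommand, or "help" when none was given."""
    parser, ecs = build_parser()
    args = parser.parse_args(argv)
    if args.command != "ecs":
        parser.print_help()
        return "help"
    if args.ecs_command is None:
        # Bare `ecs` no longer opens a shell; it only lists the subcommands.
        ecs.print_help()
        return "help"
    return args.ecs_command
```

Scripts that previously invoked bare `quiltx ecs` for a shell would need to call `quiltx ecs shell` instead.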

Testing

  • pytest tests/test_ecs.py tests/test_cli.py -q
  • ./poe test
  • ./poe lint-check

drernie and others added 11 commits April 10, 2026 10:26
Apply step lines (-> add bucket, -> create policy, etc.) now print
regardless of --verbose so users can see which operation failed.
GraphQL error details are always shown, not just in verbose mode.
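
A rough sketch of the reporting rule described in this commit, assuming step lines and GraphQL errors are built as plain strings before printing. The function names are hypothetical; the `errors`, `message`, and `path` keys follow the standard GraphQL response shape, not anything confirmed about the registry's API.

```python
def step_lines(action, target, verbose=False, detail=None):
    """Return the lines to print for one apply step.

    The "-> action target" line is always present; only the optional
    detail line depends on the verbose flag.
    """
    lines = [f"-> {action} {target}"]
    if verbose and detail:
        lines.append(f"   {detail}")
    return lines


def graphql_error_lines(response):
    """Return one line per GraphQL error, regardless of verbosity."""
    lines = []
    for error in response.get("errors") or []:
        path = ".".join(str(part) for part in error.get("path", []))
        message = error.get("message", "unknown error")
        lines.append(f"{path}: {message}" if path else message)
    return lines
```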

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…them

Bucket add errors were caught and appended to warnings that were never
shown when a later policy mutation crashed. Now failures print to stderr
immediately, policy create/update are also wrapped in try/except, and
warnings include context about which failed buckets a policy references.
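
The behavior described here might look roughly like the following, assuming policies are represented as a mapping from policy name to the bucket names they reference. `add_buckets`, `policy_warnings`, and the `add_bucket` callable are hypothetical stand-ins, not the names used in quiltx.

```python
import sys


def add_buckets(bucket_names, add_bucket):
    """Try every bucket; report each failure to stderr as soon as it happens."""
    failed = {}
    for name in bucket_names:
        try:
            add_bucket(name)
        except Exception as exc:
            failed[name] = str(exc)
            # Printed immediately so it survives a later crash.
            print(f"ERROR add bucket {name}: {exc}", file=sys.stderr)
    return failed


def policy_warnings(policies, failed):
    """Name the failed buckets that each policy still references."""
    warnings = []
    for policy_name, bucket_names in policies.items():
        affected = [name for name in bucket_names if name in failed]
        if affected:
            warnings.append(
                f"policy {policy_name} references failed buckets: {', '.join(affected)}"
            )
    return warnings
```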

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Role create/update API errors and SSO config updates were not wrapped
in try/except, so they could crash the entire apply. Now all operation
types consistently catch failures, print to stderr immediately, and
continue with remaining operations.
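
One way to get the same catch, report, and continue behavior for every operation type is a small context manager. This is an illustrative sketch only; `reported` and the step labels are hypothetical and do not reflect how quiltx structures its apply code.

```python
import sys
from contextlib import contextmanager


@contextmanager
def reported(step, failures):
    """Run one apply step; on error, record it, print to stderr, and move on."""
    try:
        yield
    except Exception as exc:
        failures.append((step, str(exc)))
        print(f"FAILED {step}: {exc}", file=sys.stderr)
        # Not re-raising suppresses the exception, so later steps still run.


def apply_all(steps):
    """Run (label, callable) pairs in order and return the collected failures."""
    failures = []
    for label, operation in steps:
        with reported(label, failures):
            operation()
    return failures
```

A caller could use the returned list to print a summary and choose a non-zero exit code once all steps have been attempted.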

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add post-mortem documenting how account-level Lake Formation settings
silently broke all Glue/Athena operations on the bench.dev stack, and
the fix (IAM_ALLOWED_PRINCIPALS grants in the deployment template).

Also add a datasets ACL example showing per-dataset licensing with
composable policies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update post-mortem with confirmed migration Lambda failure (LF blocked
GetPartitions on named_packages). Add 03-lakeformation-fixes.md
documenting all deployment repo fixes including the new table-wildcard
grants and MigrationCallout DependsOn fix.
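
For orientation, a table-wildcard grant to `IAM_ALLOWED_PRINCIPALS` has roughly the following request shape in the Lake Formation `GrantPermissions` API. The fix in this PR lives in the deployment template, not in Python, so this is only an illustration of the equivalent request; the helper names and the database name in the usage note are hypothetical.

```python
def iam_allowed_principals_grant(database_name, catalog_id=None):
    """Build GrantPermissions arguments covering every table in one database."""
    request = {
        "Principal": {"DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"},
        # An empty TableWildcard means "all tables in this database".
        "Resource": {"Table": {"DatabaseName": database_name, "TableWildcard": {}}},
        "Permissions": ["ALL"],
    }
    if catalog_id is not None:
        request["CatalogId"] = catalog_id
    return request


def apply_grant(request):
    """Send the grant with boto3 (needs AWS credentials; not run here)."""
    import boto3  # imported lazily so building the request needs no AWS SDK

    return boto3.client("lakeformation").grant_permissions(**request)
```

For example, `iam_allowed_principals_grant("example_db")` builds the arguments for a placeholder database without contacting AWS.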

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
drernie changed the title from "fix: improve stack acl error reporting and resilience" to "Release 0.7.0: ECS migration runner and ACL rollout fixes" Apr 11, 2026
drernie changed the title from "Release 0.7.0: ECS migration runner and ACL rollout fixes" to "Add ECS migration reruns and improve ACL failure reporting" Apr 11, 2026
drernie and others added 2 commits April 11, 2026 10:27
Add 06-lakeformation-db-fail.md documenting the Glue Database update
NPE. Update 05 status: per-table grants work, CreateTableDefaultPerms
does not (crashes nameless DBs). Deploy 70fa6ae succeeded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move Python API examples to README_DEV.md, streamline README.md for
CLI users, and add 07-lakeformation-final.md summarizing the deployed
Lake Formation solution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
drernie merged commit d45cf28 into main Apr 11, 2026
1 check passed
drernie deleted the 061-acl-errors branch April 11, 2026 17:42