Add support for cache_model_artifact #241

Merged
shchur merged 7 commits into autogluon:master from shchur:cache-model-artifact
Jun 2, 2026

Collaborator

@shchur shchur commented Jun 2, 2026

Issue #, if available:

Summary

Adds FoundationModel.cache_model_artifact() to bundle HuggingFace weights with the serve script into a SageMaker model.tar.gz and upload to S3, then reuse that artifact on deploy(). Required for network-isolated endpoints (e.g., SageMaker Serverless Inference) that can't reach HuggingFace at runtime.
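
The bundling step described above can be illustrated with a minimal sketch. This is not the code from this PR: the helper name `build_model_tarball` and its parameters are hypothetical, and it only shows the archive layout (weights/ plus code/<serve_script>) using the standard-library tarfile module. Downloading the weights and uploading to S3 are left out.

```python
import tarfile
from pathlib import Path


def build_model_tarball(weights_dir: Path, serve_script: Path, output_path: Path) -> Path:
    """Hypothetical helper: pack weights and a serve script into a model.tar.gz layout."""
    with tarfile.open(str(output_path), "w:gz") as tar:
        # Everything in weights_dir is stored under weights/ inside the archive.
        tar.add(str(weights_dir), arcname="weights")
        # The serve script is stored under code/, keeping its original file name.
        tar.add(str(serve_script), arcname=f"code/{serve_script.name}")
    return output_path
```

SageMaker extracts model.tar.gz into /opt/ml/model, so with this layout the weights end up under /opt/ml/model/weights, the path that the deploy wiring in this PR points model_path at.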

Changes

  • cache_model_artifact(cache_path, *, overwrite=False): downloads weights via huggingface_hub.snapshot_download, packages weights/ + code/<serve_script> into model.tar.gz, and uploads it to {cache_path}/{model_id}/model.tar.gz. Returns a new FoundationModel with model_artifact_uri set. Skips the upload if the key already exists; raises RuntimeError if the cached artifact's autogluon-cloud-version metadata doesn't match the current version (unless overwrite=True).
  • SagemakerBackend.deploy(..., repack: bool = True): when repack is False, uses AutoGluonNonRepackInferenceModel directly so a pre-bundled tarball isn't downloaded, repacked, and re-uploaded by the SDK.
  • FM deploy wiring: _deploy_backend passes model_artifact_uri as predictor_path, sets repack=False, and overrides hyperparameters["model_path"] to /opt/ml/model/weights (raises if the user also set model_path).
  • Registry refactor: FoundationModelConfig is now a frozen dataclass with sensible instance-type defaults; model_name is renamed to ag_model_key, and model_source_uri is added. The registry is reduced to chronos-bolt-{tiny,small,base} and chronos-2{,-small} (Mitra entries removed for now).
  • Naming: serve_config / AG_SERVE_CONFIG are renamed to fm_serve_config / AG_FM_SERVE_CONFIG.
  • Serialization: to_dict / to_json / from_dict / from_json on FoundationModel, excluding runtime context (role, cloud_output_path) so configs can be shared across users.
  • Dependency: huggingface_hub>=0.20,<2 added to install_requires.
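
The skip / raise / overwrite behaviour of cache_model_artifact listed above can be sketched as a small decision function. This is an illustration under assumptions, not the PR's implementation: the function name `decide_cache_action`, its parameters, and the returned strings are hypothetical, and it is assumed here that overwrite=True always triggers a fresh upload.

```python
from typing import Optional


def decide_cache_action(
    artifact_exists: bool,
    cached_version: Optional[str],
    current_version: str,
    overwrite: bool = False,
) -> str:
    """Hypothetical helper: return "upload" or "skip", or raise on a version mismatch."""
    if overwrite or not artifact_exists:
        # Assumption: overwrite=True re-uploads regardless of what is cached.
        return "upload"
    if cached_version != current_version:
        # Mirrors the described RuntimeError for a stale cached artifact.
        raise RuntimeError(
            f"Cached artifact was built with version {cached_version!r}, "
            f"but the current version is {current_version!r}; pass overwrite=True to rebuild."
        )
    # The key exists and the versions match, so the existing artifact is reused.
    return "skip"
```

In the real method, artifact_exists and cached_version would come from inspecting the S3 object at {cache_path}/{model_id}/model.tar.gz and its autogluon-cloud-version metadata; how that lookup is done is not shown in this description.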

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shchur shchur requested a review from melopeo June 2, 2026 12:32

github-actions Bot commented Jun 2, 2026

Job PR-241-16627d5 is done.
Docs are uploaded to https://d12sc05jpx1wj5.cloudfront.net/PR-241/16627d5/index.html


github-actions Bot commented Jun 2, 2026

Job PR-241-5b5fc59 is done.
Docs are uploaded to https://d12sc05jpx1wj5.cloudfront.net/PR-241/5b5fc59/index.html

@shchur shchur merged commit af03eeb into autogluon:master Jun 2, 2026
12 checks passed